Abstract

LINE-1 (L1) retrotransposons are repetitive elements in mammalian genomes. They are capable of synthesizing DNA 
on their own RNA templates by harnessing reverse transcriptase (RT) that they encode. Abundantly expressed
full-length L1s and their RT are found to globally influence gene expression profiles, differentiation state, and
proliferation capacity of early embryos and many types of cancer, albeit by yet unknown mechanisms. They are 
essential for the progression of early development and the establishment of a cancer-related undifferentiated state. 
This raises important questions regarding the functional significance of L1 RT in these cell systems. Massive nuclear 
L1-linked reverse transcription has been shown to occur in mouse zygotes and two-cell embryos, and this
phenomenon is purported to be DNA replication independent. This review argues against this claim with the goal 
of understanding the nature of this phenomenon and the role of L1 RT in early embryos and cancers. Available L1 
data are revisited and integrated with relevant findings accumulated in the fields of replication timing, chromatin 
organization, and epigenetics, bringing together evidence that strongly supports two new concepts. First, 
noncanonical replication of a portion of genomic full-length L1s by means of L1 RNP-driven reverse transcription is
proposed to co-exist with DNA polymerase-dependent replication of the rest of the genome during the same 
round of DNA replication in embryonic and cancer cell systems. Second, the role of this mechanism is thought to 
be epigenetic; it might promote transcriptional competence of neighboring genes linked to undifferentiated states 
through the prevention of tethering of involved L1s to the nuclear periphery. From the standpoint of these
concepts, several hitherto inexplicable phenomena can be explained. Testing methods for the model are proposed. 
Reviewers: This article was reviewed by Dr. Philip Zegerman (nominated by Dr. Orly Alter), Dr. I. King Jordan, and 
Dr. Panayiotis (Takis) Benos. For the complete reviews, see the Reviewers' Reports section.

Review

Introduction 

L1 elements have propagated in mammalian genomes
by means of autonomous retrotransposition.
Retrotransposition of an L1 element occurs through reverse
transcription of its RNA intermediate and subsequent
insertion of an L1 cDNA copy at a new location in the
genome [1]. As a result of such propagation, L1s comprise
~17%, ~19%, and ~23% of the human, mouse, and rat
genome, respectively [2-4]. Among the 516,000 L1 sequences
identified in the draft human genome, the majority of the
elements are truncated (usually at the 5' end) L1 copies
[2]. Only 7046 L1 sequences in the reference human
genome are full-length L1 (FL-L1) elements [5], 1000 of
which have been classified as potentially active [2] in
terms of retrotransposition. Although only ~80-100
active FL-L1s belonging to the L1Hs subfamily are
thought to be present in the reference human genome
[6], active FL-L1s seem to be more abundant in
individual genomes [7]. Human FL-L1s are similar in length
(~6 kb) but heterogeneous in sequence composition [5].
This heterogeneity results in a spectrum of functional
capabilities of FL-L1s, ranging from the inability to
translate the encoded proteins to highly active forms in
terms of retrotransposition [8]. However, it remains
unexplored whether any retrotransposition-inactive
FL-L1s are capable of reverse transcription in vivo.

A human FL-L1 element contains a 5' untranslated
region (UTR), two open reading frames (ORF1 and ORF2),
and a 3' UTR followed by an A-rich tail [9]. The L1 5'
UTR houses the sense (the first 100 bp) [10] and antisense
(positions 400-600) [11] promoters. Transcription from
the antisense promoter is one of the known mechanisms
involved in L1 silencing, and is thought to promote the
downregulation of transcription from the L1 sense
promoter because the resultant bidirectional transcripts are
processed into small interfering RNAs (siRNAs) [12]. L1s
with intact ORFs encode two proteins: ORF1p, a nucleic
acid chaperone, and ORF2p, which possesses endonuclease
(EN) and RT activities [reviewed in [8]]. Both proteins
tend to associate with their encoding RNA [13], forming
an L1 ribonucleoprotein (RNP) complex that acts as a
molecular machinery of retrotransposition [reviewed in [1]].

It has long been thought that a substantially increased 
retrotransposition rate is linked to a noticeable synthesis 
of FL-L1 transcripts and, therefore, occurs in preimplanta- 
tion embryos [14], several transformed cell lines [15-17], 
and early meiotic spermatocytes [18]. However, recent evi- 
dence shows that retrotransposition occurs mainly in early 
embryonic and cancerous cells, not in the germline 
[19-21]. This suggests that the production of FL-L1 RNA 
per se is not sufficient for retrotransposition, and the fac- 
tors that allow for retrotransposition in embryos but not 
in the germ cell line remain unknown. 

Since the acknowledgement of Barbara McClintock's
discovery of mobile genetic elements [22], the transposition
and retrotransposition of these elements have been a major
research focus in this field. L1s have successfully
propagated in the course of co-evolution with their hosts'
genomes, whereas diverse mechanisms have evolved at the
genome level to repress the activity of L1s [[8] and
references therein]. Given that L1s constitute one fifth of the
genome, it is logical to surmise that their co-evolution with
the hosts' genomes has led not only to the evolution of
an effective defence system against retrotransposition but
also to the harnessing of L1s for genome functioning. In this
regard, the mechanisms by which L1s contribute to
genome functioning remain largely unexplored. It is also not
known whether the ongoing insertional mutagenesis is
linked to some programmed L1-dependent processes in
the nucleus.

Some efforts have been made to understand the
biological significance of the abundance of L1s in the genome
in the context of functionally meaningful elements and
the abundance of L1 transcripts in particular cell types.
LINEs constitute a substantial portion of scaffold/matrix
attachment regions (S/MARs) in the human genome [23].
S/MARs play an essential role in the organization of
chromatin as functional loop domains and thus in the
regulation of transcription and DNA replication [24,25]. This
suggests that numerous L1s may regulate transcription
and DNA replication through their involvement in the
establishment of the three-dimensional (3D) structure of
chromatin. On the other hand, abundantly expressed
FL-L1s are known to globally influence gene expression
profiles, differentiation state, and proliferation capacity of
early embryos and many types of cancer, although by
mechanisms which remain unclear [26]. Thus far, the
S/MAR-related function of L1s remains unexplored in
conjunction with their expression status. The global nature of
cellular processes controlled by abundantly expressed
FL-L1s suggests that an integrative approach is required to
study the functional role of upregulated FL-L1s.
Specifically, the role of upregulated FL-L1s should be investigated
in a broad context of spatio-temporal organization and
functioning of the genome and chromatin.

An important point in this regard is that the
involvement of FL-L1 transcripts in the global regulation of early
development and carcinogenesis seems to be mediated by
L1 RT [26]. This raises the question as to whether
substantial L1-related reverse transcription exists in early
embryonic and cancer cell systems and, if so, what role it
plays. A massive nuclear L1-linked reverse transcription of
unknown functional significance has been reported in the
mouse zygote and two-cell embryo, which is believed to
be DNA replication independent [27]. However, this
review will argue that the available data do not allow for
definite conclusions regarding whether or not this L1-linked
DNA synthesis by reverse transcription is part of the
genomic DNA replication/duplication program. Therefore, it
is very important to address this question experimentally.

In this review, an attempt is made to fathom how
upregulated FL-L1s and their RT globally influence the
differentiation state and proliferation capacity of early
embryos and many types of cancer. In this context, the most
intriguing phenomenon to be explored is the massive
nuclear L1-linked reverse transcription found at the onset of
embryogenesis. It is difficult, if not impossible, to explain
the global epigenetic role of L1 RT and the nature of
massive L1-linked reverse transcription within the
framework of current concepts. Therefore, conceptual advance is
the main challenge. Herein, available L1 data are revisited
and examined in concert with relevant findings from the
fields of replication timing, chromatin organization, DNA
topology, and epigenetics. The broad picture that emerges
from this integrative approach favors two novel
fundamental concepts. First, noncanonical replication of a portion of
genomic FL-L1s by means of L1 RNP-driven reverse
transcription is likely to co-exist with DNA
polymerase-dependent origin-based replication of the rest of the
genome during the same round of DNA replication in
embryonic and cancer cell systems. Second, the role of this
mechanism is likely epigenetic. Moreover, endogenous
retrotransposition may be associated, to a great extent,
with failure of this noncanonical DNA replication of an L1
unit. An exploration of this hypothesis shows that the
mechanism of DNA replication is worthy of being retested
for specific genomic locations (distinct FL-L1 sequences)
in mammalian early embryonic and cancer cell systems.
This is important to advance understanding of DNA
replication, the biology of L1s, and mechanisms of pluripotency
and carcinogenesis.

L1 RNA and RT are essential for early embryogenesis
and carcinogenesis 

L1 RNAs and RT, abundantly expressed in preimplantation
embryos and some cancer cell lines, have been targeted in
numerous experiments to investigate their potential roles.
These experiments have brought about very important but
overlooked findings. Specifically, they demonstrate that the
functional knockdown of L1 expression via L1-specific
RNA interference (RNAi) and the inhibition of RT both
independently result in the same biological outcomes
[26,28]. This suggests that both transcription and reverse
transcription of L1s are links in the same chain in these cell
systems.

The expression of L1s has been shown to be involved
in the establishment of an undifferentiated state and a
high proliferation rate upon malignant transformation
of cells. For example, the knockdown of L1 expression
by L1 ORF2-specific antisense oligonucleotides
drastically inhibited 3H-thymidine incorporation in a
dose-dependent manner in human transformed hepatoma
(Hep3B) cells [28]. In the human A-375 melanoma cell
line, both transient and stable silencing of L1s by
ORF1-specific RNAi caused a 50-70% decrease in proliferation
rate and promoted differentiation, as was evident from
morphological changes and the expression of specific
markers [29,30]. The transcription of the proliferation
markers CCND1 and MYC was downregulated in A-375
derivative cells upon L1 silencing [30]. Moreover, both
transient and stable downregulation of L1 expression in
A-375 cells strongly reduced their tumorigenicity when
the cells were inoculated in athymic nude mice [26,30].
Notably, the targeting of L1 ORF1 by RNAi in
melanoma cells was concomitant with the drastic reduction of
translated ORF2p and RT activity in these cells [29,30].
Therefore, it is logical to assume that the observed
phenomena are linked to the transcription and subsequent
translation of FL-L1s.

The studies performed in early mouse embryos have
shown that L1 transcripts are indispensable for the onset
of embryogenesis [31]. When antisense oligonucleotides
targeting the 5' UTR and ORF1 of the TF subfamily of
FL-L1s were microinjected into the male pronucleus
18-20 h after fertilization, a complete and irreversible
arrest of development occurred at the two- or, to a lesser
extent, four-cell stage [31]. Despite the arrested
development, the microinjected embryos remained viable and
morphologically normal for several days. However,
microinjection of an ORF2-specific oligonucleotide neither
arrested embryonic development nor decreased the RT
activity, probably due to a depletion of injected
oligonucleotides through the targeting of 5'-truncated L1 transcripts
[31]. In contrast, continuous exposure of Hep3B cells to
the oligonucleotide present in the culture media [28]
could be an effective means to target L1 RNA by
ORF2-specific RNAi. Despite the ineffectiveness of the
ORF2-specific oligonucleotide at arresting development, the fact
that the effect caused by the other two types of
oligonucleotides coincided with a significant decrease of the
endogenous RT activity [31] suggests that FL-L1 transcripts
are essential for the onset of embryogenesis.

An important question is whether the role of FL-L1s
in early embryos and transformed cell lines is due to their
transcription per se or also due to the involvement of
L1-encoded RT. However, the lack of an L1 RT-specific
inhibitor, the questionable effectiveness of available anti-RT
drugs, and the abundance of RT expressed from
endogenous retroviruses (ERVs) in embryonic and cancer cells
[32-34] make this task methodologically challenging. For
this reason, the effects of downregulated expression of L1s
versus ERVs have been compared [26].

Nevirapine, a non-nucleoside RT inhibitor that inhibits
endogenous RT, affects early embryos and cancer cell lines
in a manner similar to the L1-specific RNAi [26,29,35-37].
The exposure of mouse late zygotes and two- and
four-cell stage embryos to nevirapine caused developmental
arrest at the preimplantation stages [35]. The effect of
nevirapine was dose-dependent, and the arrested
blastomeres maintained normal morphology after several days
in culture [35]. However, nevirapine did not cause
developmental arrest when added to early zygotes (the first 5 h
after fertilization) and later embryos (from the eight-cell
stage onwards) [35]. Exposure of a variety of human and
murine tumor cell lines to nevirapine quickly
reprogrammed them to differentiating derivatives: the cells
exhibited drastically decreased proliferation rates, globally
changed expression profiles of several hundred genes, and
downregulated expression of CCND1 and MYC [[26] with
a reference to unpublished data, [29,30,38]]. Additionally,
nevirapine induced the expression of cell-type-specific
differentiation markers in many transformed cell lines,
including the genetically abnormal acute myeloid leukemia
(AML) cell lines with t(15;17) PML/RARA and t(8;21)
AML1/ETO and primary blasts from AML patients
[29,37,38]. Interestingly, the effect of nevirapine was
irreversible in early embryos [36] but reversible in tumor cells
[29,37,38].

The inhibition of telomerase RT is reasoned to be an
unlikely cause of these phenomena [29,35]. However, the
interpretation of the nevirapine-caused effects as L1
RT-dependent was questioned because nevirapine was an
ineffective inhibitor of L1 RT in cell-based
retrotransposition assays [39,40]. Nevirapine was ineffective
when tested on an FL-L1 element [39] at much lower
concentrations than were effective in the reprogramming of
transformed cells [38]. The ineffectiveness of nevirapine at
inhibiting the synthetic L1 RT [40] could be attributed to
conformational changes of the inhibitor binding "pocket",
which could arise in this protein made of the L1 RT
domain and a non-L1 segment and post-translationally
modified in non-mammalian cells. Despite being ineffective at
inhibiting retrotransposition in these assays, nevirapine
nevertheless completely blocked RT activity when tested on
lysates of F9 mouse teratocarcinoma cells [38], which are
known to actively express FL-L1s [16]. Efavirenz, another
non-nucleoside RT inhibitor, decreased the proliferation
rate and promoted the differentiation of cancer cell lines in
a manner akin to nevirapine [29,37]. It also was found to be
an effective L1 RT inhibitor in in vitro retrotransposition
assays when used at similar concentrations [40]. Taken
together, these data suggest that although nevirapine seems
to be a less potent L1 RT inhibitor than efavirenz, it can
inhibit endogenous L1 RT when used at high concentrations.

If nevirapine does inhibit L1 RT in vivo, the
unresponsiveness of early zygotes, known to have L1 RT carried over
by the spermatozoid [27], and late pre-implantation
embryos, which also actively express FL-L1s [14], requires
further investigation. It can be hypothesized that the presence
of a noticeable lag period between the onset of the exposure
of two- and four-cell embryos to nevirapine and the
developmental arrest [35] is because the presynthesized L1 RT
was incapable of binding this drug. The unresponsiveness
of blastocysts to nevirapine could also be because the
concentration of nevirapine reaching cells of the inner cell
mass (ICM) was too low to cause noticeable effects.

Actively transcribed and reverse transcribed L1s, rather
than ERVs, are thought to be a driving force of
tumorigenic reprogramming and early development progression
[26]. L1-interfered A-375 cells exhibited a downregulated
expression of HERV-K, the biologically most active family
of human ERVs, whereas a functional knockdown of
HERV-K did not affect the level of L1 expression,
proliferation rate, or phenotype [30]. Consistent with this
observation, downregulated expression of murine
endogenous retrovirus-like element (MuERV-L) in mouse
zygotes caused only mild and transient suppression of
development [41]. Similarly, stable knockdown of
expression of ERVs in early cloned transgenic pig embryos
did not interfere with normal embryonic and post-natal
development [42].

The phenomenon of massive nuclear reverse
transcription coinciding with a two-fold increase of L1 DNA
copy number in the mouse zygote and the two-cell
embryo, as well as the transient nature of this increase (it
diminishes in blastocysts) [27], strongly suggests that L1
RNA is actively and transiently reverse transcribed in
preimplantation embryos. This nuclear reverse
transcription could be due to L1 RT rather than ERV RT because,
based on current knowledge, ERVs are reverse
transcribed in the cytoplasm [43].

Attempts to explain how L1 transcription and reverse
transcription can be implicated in fundamental
biological processes in early embryos and cancers have not
yet brought about any concrete and plausible model.
Dr. Spadafora and colleagues hypothesize that both
L1-dependent transcriptional interference and non-random
retrotransposition events that are followed, at least in
embryos, by the excision of a portion of newly inserted L1
copies might have a role in these cell systems [27,36,38].
However, the term "transcriptional interference", defined
as the activity of one transcriptional unit modified by the
activity of another [44], does not specify the molecular
mechanism. Furthermore, the fact that the addition of
anti-RT drugs to cancer cell lines quickly reprograms them
to "normal" phenotypes, and their withdrawal abolishes
this effect, does not favor the hypothesis of genetic
changes. It is unlikely that L1 RT regulates fundamental
cellular processes by massive retrotransposition in embryos
and by another means in cancers. Spadafora [36] also
hypothesized that L1 RT could be implicated in the
substantial repositioning of chromatin in the nuclei and, therefore,
in the modulation of expression of other genes. This
assumption was made based on unpublished data, obtained
in his laboratory, that suggest the nuclei of
nevirapine-exposed F9 teratocarcinoma cells undergo a reorganization
of their functional compartments. However, no molecular
mechanism has been proposed to explain how L1 RT can
be involved in chromatin reorganization.

A model to explain how L1 RNAs and RT are
implicated in the fundamental processes in early embryos and
certain cancers must address several issues. Specifically,
it should: (i) demonstrate the utility of expressed L1
RNAs, ORF1p, and ORF2p, taking into consideration
that ORF2p acts as an RT, synthesizing cDNA; (ii)
explain why early embryos stop dividing, but transformed
cells do not show a complete lack of proliferation in
response to L1-specific RNAi or RT inhibition; (iii) explain
why the inhibition of RT (most likely L1 RT as discussed
above) causes irreversible arrest of early embryonic
development versus the reversible effects in cancer cell
lines; and (iv) describe how downregulation of L1s can
reprogram dedifferentiated cancer cells to their original
cell types but not to other cell types. An attempt to
address these issues is made below.

L1 elements and replication timing programs in
pluripotent and cancerous cells 

The results of the insightful studies by Dr. Gilbert's
laboratory on changes in replication timing and chromatin
organization linked to the loss of pluripotency in
differentiating embryonic stem cells (ESCs) [45,46] might
shed light on the role of upregulated L1s in establishing
an undifferentiated state in a cell.

The replication-timing program is the order in which 
different chromosomal domains are replicated during S 
phase [47]. Genome-wide profiling of replication timing in 
numerous cell types in mouse and human has indicated
that chromosomes consist of alternating early and late S 
replicating domains [45,48-50]. Multi-megabase replica- 
tion domains are prevalent in differentiated cells, whereas 
alternating small (400-800 kb) early and late S replication 
domains are well represented in mouse and human pluri- 
potent ESCs and in mouse induced pluripotent stem cells 
(iPSCs) [45,49]. Importantly, the replication timing profile
of the genome is a dynamic, developmentally regulated 
feature that is coordinated with the reprogramming of 
gene expression and repositioning of chromosome do- 
mains within the nucleus [45,46,51]. Differentiation of 
mouse pluripotent ESCs to neural precursor cells (NPCs) 
is associated with replication timing changes that affect 
approximately 20% of the genome [45]. 

There has been an attempt to determine whether
pluripotency is associated with distinct features of a
replication timing profile in a genomic context [45]. Two
features of a replication timing profile were originally
considered to be characteristic of pluripotent cells [45].
One was the presence of small domains that change
replication timing from early in ESCs to late in NPCs (EtoL)
and, vice versa, from late to early (LtoE). These changes
result in the merging of small domains into larger,
coordinately replicating domains with a consequent 40%
reduction in number. The interruption of late replicating
L1-rich AT isochores by small early replicating (EtoL)
domains and early replicating L1-poor GC isochores by late
replicating (LtoE) domains was also thought to be a
feature associated with pluripotency [45]. However, it has
become evident that the consolidation of replication
domains and their alignment to AT and GC isochores were
more specific to the formation of ectoderm than
mesoderm and endoderm [46]. Moreover, the improvement of
the correlation of replication timing to GC/L1 content was
weaker in differentiating human versus mouse ESCs [49].
In terms of replication timing features, the most notable
"fingerprint" or "indicator" of pluripotency in mice was
found to be the presence of early S replicating domains that
reside in a subset of L1-rich (~27.5%)/AT-rich (~59.7%)
isochores with an unusually high (for AT isochores) density
of genes [45,51]. The large EtoL replication-timing switches
of these domains are strongly associated with loss of
pluripotency [45,51].

A study of replication timing and transcription profiles 
of a variety of independent cell lines representing
different stages of early mouse embryogenesis [46] has
revealed that (i) loss of pluripotency is associated with a
number of EtoL replication-timing changes, which are 
lineage-independent and completed by the late post- 
implantation epiblast stage prior to germ layer specifica- 
tion and are stably maintained in all downstream lineages; 
(ii) these EtoL changes precede the downregulation of key 
pluripotency transcription factors [POU5F1 (also known 
as OCT4)/NANOG/SOX2]; (iii) these EtoL replication- 
timing changes tend to be accompanied by a repositioning 
of these domains toward the nuclear periphery and a 
downregulation of genes residing in these segments, espe- 
cially those with low CpG density promoters; (iv) the com- 
pletion of lineage-independent EtoL changes coincides 
with a transition of these EtoL domains to a stable silent 
epigenetic state, which is very difficult to reprogram back 
to the pluripotent state in terms of replication timing and 
the expression of genes with low CpG density promoters; 
(v) DNA methylation of genes with low CpG density pro- 
moters within these EtoL domains and activity of several 
chromatin modifying enzymes are not a main cause of the 
established irreversibility; (vi) the acquired stable silencing 
of lineage-independent EtoL domains on autosomes is 
reminiscent of the irreversible heterochromatinization of 
the inactive X chromosome (Xi) in female mammals and 
occurs within the same time frame in development; (vii) 
the subnuclear repositioning of EtoL domains occurs in 
parallel with a dramatic switch to chromatin compaction 
along the nuclear envelope; and (viii) these lineage- 
independent EtoL domains represent 6.1% or 155 Mb of 
the genome. Interestingly, lineage-dependent EtoL and 
LtoE changes, occurring after the late epiblast stage, are 
easier to reprogram back than lineage-independent EtoL
switches. An important conclusion from this study is that 
loss of pluripotency is associated with establishing a very 
stable epigenetic barrier in the absence of large-scale tran- 
scription changes, and that these epigenetic changes are 
mapped to lineage-independent L1-rich/gene-rich EtoL
domains [46]. 

It is largely unknown what mechanism drives replication
timing changes during loss of pluripotency and exactly
what forces the pluripotency "indicator" domains to
replicate early in ESCs. Rif1 protein has been recently
identified as a key determinant that establishes the replication
timing program and the size of replication domains in
mouse embryonic fibroblasts and in human transformed
HeLa cells [52,53]. Rif1 is thought to perform this role
by attaching certain chromatin segments to the nuclear
matrix and establishing restricted access to the
Rif1-bound segments for replication factors in early S phase
[52,53]. Rif1 expression is developmentally regulated [54];
however, the functional significance of the expression
patterns and a correlation with pluripotency are not
understood. Although Rif1 is highly expressed in totipotent and
many pluripotent cell types (zygotes, cleaving embryos,
ESC lines maintained in vitro, primordial germ cells), it is
downregulated in the ICM of the blastocyst [54]. Rif1
becomes downregulated by the downregulation of OCT4
and NANOG [55]. Knockdown of Rif1 leads to
differentiation of ESCs [55], which suggests that Rif1 is
implicated, at least to some extent, in the maintenance of a
pluripotency-specific replication timing profile.
However, the lack of a strong correlation between Rif1
expression and pluripotency [54], the fact that Rif1 mainly
regulates mid-S replication domains, and its role as a
preventer and not a promoter of early-S replication
[52,53] suggest that this protein is unlikely to provide
early-S replication of the EtoL pluripotency "indicator"
domains.

A number of observations suggest that late replication is 
the default state of EtoL developmentally regulated do- 
mains, and that an additional as yet unknown property 
must be imposed upon these domains in order to switch 
them to the early replication state [50]. It is worth
mentioning that no one has sought to discover whether the
active transcription of L1s, found in both human and mouse
ESCs [45,56,57], plays a role in the early replication of
L1-rich EtoL domains. In this regard, I propose that specific
subsets of FL-L1 transcripts, if present, allow for the early 
replication and euchromatinization of the EtoL domains 
to which they map. The downregulation of this transcrip- 
tion may trigger EtoL replication timing switches and 
cause the heterochromatinization of the corresponding 
domains, thus contributing to loss of pluripotency. The 
downregulation of transcription of a different subset of 
L1s might be involved in loss of totipotency. This idea is
supported by the fact that loss of either totipotency or 
pluripotency coincides with a wave of chromatin compac- 
tion near the nuclear periphery in the absence of large- 
scale changes of transcription profiles [46,58]. Uniformly 
dispersed chromatin fibers of the pronuclei undergo dra- 
matic reorganization in two- and four-cell stage embryos 
when heterochromatin blocks emerge near the nuclear en- 
velope, nucleolar precursor body, and in the nuclear inter- 
ior [58,59]. The first wave of heterochromatinization is 
associated with the loss of totipotency that occurs by the 
eight-cell stage [60]. It is followed by a conversion of
chromatin to a highly dispersed conformation in pluripotent
cells but not in the lineage-restricted trophectoderm and 
primitive endoderm of the blastocyst [58]. The second 
wave of chromatin compaction near the nuclear periphery 
is linked to the loss of pluripotency [46]. It is tempting to 
speculate that, in both cases, similar epigenetic barriers 
would be established through different cohorts of EtoL 
changes accompanied by the downregulation of L1
transcription from these EtoL domains in the genome.

A surprising finding might be relevant to the putative 
link between upregulated L1s and replication timing
features: the replication timing profile of human ESCs
(hESCs), derived from preimplantation blastocysts,
resembles the profile of more mature mouse EpiSCs, derived
from the epiblast of post-implantation embryos, but not of 
mouse ESCs (mESCs) [49]. Mouse EpiSCs can be charac- 
terized as cells in which many EtoL domain changes are 
completed, and compact chromatin is accumulated near 
the nuclear envelope [46]. Therefore, a larger portion of 
the genome is likely to be represented by euchromatin in 
mESCs than in hESCs. This can be explained by the fact 
that FL-L1s are ten-fold more abundant in the mouse
compared to the human genome [61]. It is reasonable to 
speculate that the number of upregulated FL-L1 units per 
genome might also be larger in mESCs than in hESCs. 
This could result in the abundance of early S replicating 
domains in mESCs, but not in hESCs, and lead to the 
euchromatinization of a larger portion of the genome in 
mESCs when compared with hESCs. 

An aberrant execution of the developmental program 
is thought to be an important constituent of carcinogenesis
[62]. The characteristic features of replication timing
profiles of cancerous cells support this view. Findings in 
malignant cells from patients with acute lymphoblastic 
leukemia show that (i) replication-timing changes occur 
in units of the same size range (400-800 kb) as normal 
developmentally regulated replication domains; (ii) more 
than half of these changes align with the boundaries of 
developmentally regulated replication domains; and (iii) 
distinct replication timing changes can be considered a 
"pan-leukemic fingerprint", which slightly overlaps with 
a "pluripotent fingerprint" [63]. An overlap of the repli- 
cation timing profiles of another type of malignant cells, 
teratocarcinoma cells, and pluripotent embryonic cells 
can be even more profound. Teratocarcinoma cells that 
resemble embryonal carcinoma cells as well as cells of 
the ICM [64] are known to develop into normal tissues 
and germ line cells after transplantation to the blastocyst 
[65]. This suggests that the transplanted teratocarcinoma 
cells establish the same "pluripotent" replication timing 
and gene expression profiles as the recipient cells pos- 
sess at the blastocyst stage. It is tempting to speculate 
that this can be achieved, at least in part, due to a simi- 
larity between single-stranded FL-L1 transcription pro- 
files of teratocarcinoma and the ICM cells. In fact, the 
L1Hs, L1PA1, and L1PA2 subfamilies equally contribute
to L1 transcript profiles of human embryonal carcinoma
and ESCs, whereas older subfamilies are differentially 
represented in these cells [57]. 

Epigenetic repertoire of full-length L1 transcripts

Several recent studies have shown that FL-L1 transcripts
and L1 RT are implicated in epigenetic regulation of numerous
genes in normal embryonic development and also in
tumorigenesis [26,57,66]. However, the nature of this
epigenetic regulation and the involved molecular mechanism(s)
are largely unexplored and invite numerous future
investigations. First, little is known about whether the
expressed subsets of FL-L1s, and the putative epigenetic
role(s) they might have, change during development. Second,
it is not clear whether the active expression of
single-stranded FL-L1 RNAs regulates the state of chromatin,
and, if so, whether it promotes euchromatinization or
heterochromatinization. Finally, it is not known whether
transcription of an FL-L1 element, FL-L1 RNA, or reverse
transcription of this RNA regulates or modifies the
chromatin state.

Although sequence profiles of transcribed FL-L1s and
their changes during development are largely unknown,
some data demonstrate that the transcription of distinct
subsets of L1s is likely developmentally regulated and
stage-specific. Different patterns of expression of FL-L1s
have been found on the X chromosomes during early (day
0-4) compared to late (day 8-10) stages of differentiation
of female mESCs [66]. The precise sequence composition
of L1s transcribed from the active X chromosome (Xa)
and the Xi, their localization on the chromosome map,
and the epigenetic role they might play during early ESC
differentiation remain unknown. During the late stages of
differentiation, when transcription of L1s in the nucleus
and from the Xa is globally reduced, transcription of L1s
from the Xi is still detectable [66]. This transcription is
thought to be bidirectional and to play a role in the
production of siRNAs that promote heterochromatinization
in cis and thus downregulate neighboring genes that escaped
Xist-based silencing [66]. Importantly, sense transcription
of FL-L1s seems to prevail over bidirectional transcription
in ESCs and then appears to shift largely to bidirectional
transcription of L1s as the cells differentiate. This
notion is supported by two findings. First, the frequency
of small RNAs derived from L1 elements of the TF subfamily
is two-fold higher on day 5 of mESC differentiation than
on day 0 [66]. Second, the activity of the L1 sense
promoter is markedly more prevalent than the activity of the
antisense promoter in hESCs, which express 10 to 15
times more sense L1 RNA than differentiated cells [57].
Together, these data favor the hypothesis that the L1 RNA
profiles are developmentally regulated.

Unidirectional (sense) and bidirectional transcription of
FL-L1s can coexist in a cell, and they likely play opposite
epigenetic roles. Both types of transcription of L1s have
been found in ESCs [56,57,66]. Bidirectional transcription
from the L1 5' UTR may contribute to silencing of a portion
of the chromatin domains through an siRNA-based
mechanism in ESCs. At the same time, unidirectional
transcription of another subset of FL-L1s might promote the
euchromatinization in cis of a different cohort of domains.

Although no direct evidence demonstrates that sense
transcription of FL-L1s is implicated in euchromatinization
in ESCs, this type of transcription of FL-L1s is associated
with euchromatinization in cancer cells. This is supported
by the fact that RNAi-based downregulation of the expression
of FL-L1s as well as the inhibition of RT in
transformed cells causes the reprogramming of chromatin
segments to a more compact state in their derivatives
[30,38]. Because unidirectional sense transcription of
FL-L1s appears to shift to bidirectional transcription upon
differentiation of ESCs, it is tempting to hypothesize that the
epigenetic role of FL-L1 transcripts might change in
development.

How FL-L1 RNAs direct or mediate changes of chromatin
conformation and transcriptional activity of neighboring
genes is largely unknown. There are at least two
potential types of FL-L1 transcripts in the nucleus,
assembled and unassembled with ORF1p/ORF2p, that
might have different epigenetic roles and underlying
mechanisms. For example, the sense transcription of an FL-L1
element and/or the transcripts, incorporated in cis into
the chromatin, are essential for the formation and function
of a neocentromere and the selective repression of
genes within or adjacent to this domain [67]. It remains to
be determined whether these transcripts are assembled
with ORF1p/ORF2p or not and whether the sense
transcription of FL-L1s inhibits the activity of neighboring
genes in other genomic locations. FL-L1s, which form L1
RNP complexes with ORF1p and ORF2p in the cytoplasm
[68-70], are found in ESCs and many cancer cell lines
(discussed below). Upon entering the nucleus, such FL-L1
RNPs might drive a reverse-transcription-based mechanism
linked to the establishment of a totipotent/pluripotent
state in embryos and an undifferentiated state in many
cancers. This idea is supported by findings from the FL-L1
knockdown and RT inhibition experiments discussed
above. The results of these experiments also favor the idea
that both transcription and reverse transcription of FL-L1s
are integral steps of an unknown epigenetic mechanism.
L1-encoded proteins preferentially associate with and act
on the L1 RNA from which they are translated (a phenomenon
termed cis-preference) [13,71,72]. Therefore, it is unlikely
that RNAi-based knockdown of transcription of FL-L1s
and the inhibition of L1-encoded RT could target separate
epigenetic mechanisms and result in the same outcome. In
this context, the question arises as to what part L1 reverse
transcription plays in this mechanism.

Massive L1-linked reverse transcription found in mouse
zygotic pronuclei and nuclei of the two-cell embryo is be- 
lieved to be DNA replication independent for two reasons: 
(i) the exposure of the zygotes to aphidicolin, an inhibitor 
of DNA polymerase, 4 h after fertilization did not block 
DNA synthesis as evidenced by a significant incorporation 
of 5-bromodeoxyuridine (BrdU), the analogue of thymi- 
dine; however, when aphidicolin was used in conjunction 
with abacavir, a nucleoside inhibitor of reverse transcription,
the incorporation of BrdU was strongly inhibited;
and (ii) this aphidicolin-resistant, abacavir-sensitive synthesis
of DNA is observed 4-8 h after fertilization, whereas,
according to older publications, DNA replication is 
thought to start 8-12 h post-fertilization [27]. The first
point, namely the interpretation of aphidicolin-resistant 
synthesis of DNA as unrelated to genomic DNA replica- 
tion, is based on the current concept of DNA replication 
that implies genomic DNA is replicated (with the exception
of telomeres) solely by DNA-directed DNA polymerases
[reviewed in [73]]. However, it is worth mentioning that the 
current concept of DNA replication, being well-established 
through numerous experiments and entrenched in the 
minds of the scientific community, has not been tested in 
all genome locations in all cell systems in all organisms at 
all possible conditions. Potentially unexplored exceptions to 
the well-known mechanism may exist in distinct genome 
locations and cell systems. Telomerase is a notable example 
of a reverse transcriptase carrying its own RNA molecule, 
which is used as a template to elongate chromosome ends 
[74]. L1 RNP could be another example of an enzyme-RNA
molecular machinery driving genome-wide replication of
L1 sequences. As research has progressed, it has become
apparent that L1 RT and telomerase have remarkable similarities
[[75] and references therein]. The second point with
respect to DNA replication starting in the zygote 8-12 h 
after fertilization could be fallacious. The references pro- 
vided by Vitullo and co-authors [27], when traced back to 
original publications, lead to results obtained by microdensitometry
of Feulgen-stained pronuclei [76], a low-sensitivity
methodology. The provided references also lead to publications
in which dating of post-fertilization events was
inferred, probably incorrectly, from time passed after 
the injection of human chorionic gonadotropin (HCG) 
[77,78]. More accurate estimations of the timing of pro- 
nuclear DNA synthesis in naturally ovulated and fertil- 
ized mouse eggs of six different genotypes, performed 
by cytofluorometric measurement of ethidium bromide- 
stained DNA, have indicated that the S phase starts 
at ~4 (3.8-4.6) h post-conception and lasts between 6.4 
and 11.1 h in various genotypes [79]. Accordingly, it is 
reasonable to assume that the onset of ³H-thymidine incorporation
in the pronuclei at 21 h post-HCG, which is
thought to correspond to 7-9 h post-fertilization [77],
and the onset of labeling with BrdU at 4 h after
fertilization [27] can be attributed to the same event: reverse
transcription. The similarity of the early labeling
patterns by ³H-thymidine and BrdU in male and female
pronuclei [27,77] supports this notion. It is also worth
noting that the incorporation of either ³H-thymidine or
BrdU can only be interpreted as DNA synthesis but not 
as a particular mechanism thereof. 

The DNA synthesis by reverse transcription found at
the onset of mouse embryogenesis is thought to be
L1-linked [27]. Data obtained by quantitative PCR (qPCR)
analyses with primers designed to amplify FL-L1s of the
TF subfamily of L1s demonstrate an approximate two-fold
increase of the L1 DNA copy number per haploid
genome in the mouse zygote, two-cell embryo, and morula
[27]; however, the time window and the phase of the
cell cycle in which the qPCR analyses were performed
were not indicated. Consequently, the design of the
above-mentioned experiments [27] has led to results
that are inconclusive in terms of whether the L1-linked
DNA synthesis by reverse transcription is DNA replication
dependent or independent.

In this regard, it is important to compare L1-related
qPCR data obtained at two points of the zygotic cell cycle. 
The first point should be during the phase of the cell cycle 
when reverse transcription occurs but DNA polymerase- 
dependent DNA replication has not yet started. The second 
point should be when DNA replication is complete (i.e., in 
G2/mitosis). Although the results of such an experiment 
cannot provide evidence of the nature of the observed L1-linked
DNA synthesis by reverse transcription (a potential
synthesis of extragenomic L1 DNA copies cannot be ruled
out), this approach could be a good starting point to test 
whether this reverse transcription is DNA replication 
dependent or independent. Therefore, the data available at 
this time are not convincing evidence of the massive nu- 
clear reverse transcription occurring in early embryos being 
DNA replication independent. 
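
The two-point qPCR comparison described above can be illustrated with a minimal sketch of relative quantification. It assumes the standard 2^(-ddCt) calculation with near-100% amplification efficiency and a single-copy reference amplicon; neither assumption comes from the cited study [27], and the function names and Ct values are hypothetical placeholders, not measured data.

```python
# Hypothetical sketch: relative L1 DNA copy number between two points of
# the zygotic cell cycle, using the 2^(-ddCt) method. Assumes ~100%
# amplification efficiency and a single-copy reference amplicon.

def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Ct of the L1 amplicon normalized to the reference amplicon."""
    return ct_target - ct_reference


def relative_copy_number(ct_l1_first: float, ct_ref_first: float,
                         ct_l1_second: float, ct_ref_second: float) -> float:
    """Fold change of normalized L1 copy number, second point vs. first."""
    ddct = (delta_ct(ct_l1_second, ct_ref_second)
            - delta_ct(ct_l1_first, ct_ref_first))
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Placeholder Ct values chosen only to illustrate the arithmetic.
    fold = relative_copy_number(20.0, 25.0, 19.0, 25.0)
    print(f"Normalized L1 copy number, second vs. first point: {fold:.2f}-fold")
```

Because the L1 signal is normalized to a reference locus at each point, the value reports the change in L1 copies relative to the rest of the genome rather than the absolute amount of DNA; how a given ratio is interpreted still depends on the caveats raised above, including possible extragenomic L1 DNA copies.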

Hypothesis and rationale: two modes of L1 DNA
replication as an epigenetic switch 

In this review, I would like to propose that the L1-linked
reverse transcription-based DNA synthesis found in early
embryos and also likely to be found in undifferentiated cancer
cells is part of the DNA replication program in these types
of cells. This implies that two different mechanisms of
DNA replication, canonical and noncanonical, can co-exist
to replicate the genome during the same round of DNA
replication in early embryos, ESCs, and many cancers. In
these cell systems, a portion of genomic FL-L1 sequences is
proposed to replicate by the noncanonical mechanism (i.e.,
L1 RNP-driven reverse transcription starting on an L1
RNA template bound to a complementary "parental" genomic
L1 DNA sequence) (Figure 1). The noncanonical
mechanism is proposed to trigger when FL-L1 RNAs are
actively transcribed and translated and when full-size L1
RNPs are assembled. Full-size L1 RNP is herein defined as
consisting of FL-L1 RNA, L1 ORF2p, and multiple trimers
of L1 ORF1p (discussed below). Therefore, there can be
two modes of DNA replication of FL-L1 sequences in the
genome: canonical and noncanonical. The noncanonical
mode of replication of FL-L1s is proposed to be L1
RT-driven, origin-independent DNA replication as a part
of normal early development. The canonical (DNA
polymerase-driven, origin-dependent) mode of L1 DNA
replication is likely to replace the noncanonical replication
in differentiating cells when the synthesis of full-size L1
RNPs is downregulated. The noncanonical mode of FL-L1
replication can be recapitulated in cancer cells.

Figure 1 A hypothetical mechanism of noncanonical L1 DNA replication. A. Formation of an L1 DNA:RNA duplex. Heterogeneous FL-L1
RNAs, being assembled with L1 ORF1p and ORF2p, find their "parental" complementary sequences in the genome and form L1 DNA:RNA hybrids.
The chaperone activity of ORF1p, which includes the melting of mismatched duplexes, is deemed indispensable for pairing of the L1 RNA with
the fully complementary L1 DNA. The displaced DNA strand of an L1 unit is likely stabilized by auxiliary factors. B. First-strand cDNA synthesis.
ORF2p bound to the 3' end of the FL-L1 RNA nicks the bottom DNA strand and synthesizes the first cDNA strand from the liberated 3'-hydroxyl.
C. Second nick formation. When ORF2p reaches the 5' end of the L1 RNA, it nicks the top DNA strand at the 5' end of the L1 element. ORF2p
then switches templates from the RNA to the cDNA. The L1 RNA likely dissociates at this point. D. Second-strand cDNA synthesis on the first
cDNA template. E. Nicking at the genomic DNA-cDNA junctions and the ligation of the segments of the "parental" DNA at the sites of the first
and second nicks by auxiliary factors. F. Unpairing of the new L1 cDNA strands and their pairing with the "parental" strands by auxiliary factors.
The ends of the new cDNA strands are joined with the new strands synthesized by the canonical mechanism on the adjacent segments of the
"parental" strands. Each cell division produces two cells with equal amounts of old and new DNA synthesized by a combination of two
different mechanisms.

The next logical question is why replication of FL-L1
sequences by either the canonical or the noncanonical
mechanism is important for a cell. The answer could be
that the switch from the noncanonical to the canonical mode
might be a fail-safe means to keep a large set of
embryo-specific genes stably silent when the noncanonical
mechanism of DNA replication is "off". Specifically, the
noncanonical mechanism of L1 DNA replication may
serve as a noncanonical epigenetic determinant that
regulates the transcriptional competence of a large cohort
of neighboring genes. This regulation could be
implemented through the prevention of a set of L1-rich
EtoL domains from being tethered to the inner nuclear
membrane (INM) and from being packaged into
late-replicating facultative heterochromatin (Figure 2A). It
follows then that when FL-L1 sequences are not replicated
by the noncanonical mechanism, they would tend
to be silenced due to their sequence composition.




Figure 2 A hypothetical model of two modes of replication of full-length L1s as an epigenetic switch. A: Undifferentiated (early
embryonic and cancer) cells. A subset of small L1-rich domains of the genome (EtoL domains as per Hiratani and co-authors [45]) replicates DNA
in early S (red arrow). Transcriptional competence is imposed on these domains during early S phase of DNA replication. These L1-rich EtoL
domains do not tether to the nuclear lamina and have a loose conformation of chromatin loops. Genes residing in these domains are linked to
undifferentiated states of a cell. Noncanonical replication of FL-L1s residing in these domains might prevent them from binding to the nuclear
matrix and from recruiting the ORCs (panel A, right) and, therefore, from being silenced through the ORC/HP1-mediated pathway of
heterochromatin assembly. B: Differentiated and differentiating cells. Global programmed downregulation of FL-L1s upon differentiation of
pluripotent embryonic cells results in switching "off" the noncanonical mechanism of L1 DNA replication. In the absence of L1 RNA paired with a
complementary "parental" L1 DNA for noncanonical L1 DNA replication, L1s might attach to the nuclear matrix and form intrastrand G4
structures. The majority of replication origins are associated with G4s [80]. The L1-bound ORC might bind HP1 in a distinct chromatin
environment and, thereby, play a crucial role in the establishment of a silent state on the L1-rich/gene-rich EtoL domains (panel B, right). These
domains switch to their default state characterized by late replication, dense conformation, and tethering to the nuclear periphery [50] (blue
arrow). Basically, the same switch from noncanonical to canonical replication of L1s residing within EtoL domains might occur upon
differentiation of poorly differentiated cancer cells caused by the knockdown of the expression of L1s. Legend entries recovered from the
figure: G4 motif; FL-L1 subjected to noncanonical DNA replication.



Belan Biology Direct 2013, 8:22
http://www.biologydirect.com/content/8/1/22






Sequence features of L1s might favor anchoring to the
nuclear matrix and binding of the origin recognition
complex (ORC), two potential mechanisms that may
contribute to silencing of L1s and adjacent sequences.
The ORC might facilitate heterochromatin assembly and
tethering of L1s to the nuclear periphery (discussed
below). Therefore, origin-based replication of a distinct set
of L1s might also be considered an epigenetic mechanism,
which contributes to the default silencing of the involved
domains (Figure 2B). The noncanonical replication of
FL-L1s might exert rather specific, albeit different, effects on
gene expression profiles depending on the subset of
polymorphic FL-L1s involved in noncanonical DNA
replication. This implies that a cell type-specific subset of
noncanonically replicated FL-L1s determines the cohort of
L1-rich EtoL domains that are transcriptionally competent
in this particular type of cells.

A notable insight into the initiation of DNA replication 
in eukaryotic systems has brought about the concept of a 
"relaxed replicator" as a "context-dependent element", 
which includes a DNA sequence in conjunction with DNA 
topology, DNA methylation, chromatin-bound proteins, 
transcriptional activity, and short-/long-distance chromatin 
effects [81]. This concept implies that the binding of the 
ORC to chromatin is guided by distinct combinations of 
sequences, chromatin contexts, and components of nuclear 
structure [81,82]. Accordingly, the replicator-initiator inter- 
actions are thought to have an additional function (or func- 
tions) beyond their role in DNA duplication [81]. In 
this context, it is logical to surmise that numerous ORCs, 
which remain bound to DNA by ORC2-5 subunits 
throughout the cell cycle [83], influence the formation of a 
certain chromatin environment through the recruitment of 
chromatin proteins and binding to the nuclear matrix. In- 
deed, a growing body of evidence indicates that the ORC is 
essential for the formation of heterochromatin in eukary- 
otes [84-87]. In mammals, the ORC recruits heterochro- 
matin protein HP1 [86,87]. Factors that facilitate this 
process have begun to be revealed, one of which is an 
H3K9me3 environment [87]. 

In the context of nuclear structure, a significant portion
of LINEs seems to be ORC-binding sites and to function as
MARs. This is suggested by the fact that origins colocalize
with MARs [83,88,89] and that human LINEs are
overrepresented among S/MARs, comprising 40% of the
sequences [23]. The high overrepresentation of LINEs
among S/MARs could be because S/MARs [90] and L1
sequences (discussed below) share a particular feature:
partial unpairing of DNA strands. S/MARs are functionally
heterogeneous; SARs are mainly transcription-linked, and
MARs are replication origin/silent gene-associated [25,91].
Taking into consideration the functional heterogeneity of
S/MARs and the tendency of L1-rich domains to be silent
and replicate late at the nuclear periphery in differentiated
cells [45,92], L1s can be even more overrepresented
among the origin-associated MARs than among "bulk" S/MARs.

Several other facts also support the notion that the
sequence composition of L1s makes them prone to bind
ORCs. For example, poly(dA:dT) elements (5-mers or longer
tracts), known to be present within L1s, disfavor
nucleosome occupancy not only over themselves but also over
adjacent regions [93]. Low nucleosome occupancy is
thought to be a necessary, but not sufficient, requirement
for the assembly of ORCs and pre-replication complexes
near these regions [94]. Another feature of L1 sequences
that might be favorable for ORC binding is a guanine-rich
tract known to form an intrastrand tetraplex (G-quadruplex
or G4) in the L1 3' UTR [95]. This feature is present in all
L1s with intact 3' UTRs [95] and conserved throughout
mammalian evolution [96]. About 90% of human origins
are represented by G4-forming motifs [80], and these
structures are known to be nucleosome-free regions [97]. Taken
together, these data suggest that ORCs are highly likely to
bind to G4 structures of those L1s that tether to the nuclear
matrix.

If the sequence features of L1s (G4 structures, the
tendency for partial unwinding, and nucleosome disfavoring)
do promote ORC binding, the L1-bound ORCs may be
essential for establishing a very stable silent state on L1-rich
segments of the genome. One potential mechanism of the
ORC-dependent silencing of L1s could be the recruitment
of HP1 to the L1-bound ORCs. HP1γ, one of the isotypes
of HP1 associated with the foci of facultative
heterochromatin [98], is known to contribute to the silencing of
FL-L1s [99]. Knockdown of the Cbx3 gene that encodes HP1γ
activates repressed L1s [99]. The strong binding activity of
HP1γ with lamin B receptor, an integral protein of the
INM [100,101], could also be involved in the sequestration
of L1s to the nuclear periphery. The recruitment of HP1 to
the ORC is guided by H3K9me3 [87]. Although H3K9me3
is weakly represented on L1 sequences regardless of
whether L1s are active or silent [102-104], H3K9me3 is
overrepresented within L1-rich dark (Q or G) bands [105].
Therefore, it would be timely to gain insight into the
putative link between the ORC, HP1, and H3K9me3 with
regard to L1 silencing.

Another potential mediator of the ORC-dependent
silencing of L1s might be an ORC-binding factor, ORCA
(ORC-associated protein). ORCA associates with the ORC
in the presence of repressive histone marks and methylated
DNA and functions as a facilitator of heterochromatin
formation [87]. Thus, although it is not completely
understood how a very stable silent state is imposed on
L1-rich domains, G4-forming motifs within L1 sequences
might be 'landing pads' for the ORC, an important player
in heterochromatin assembly.

A genome-wide origin mapping study in hESCs and
embryonic fibroblasts [80] has contributed a very important
finding by demonstrating that EtoL developmentally
regulated replication domains acquire some additional origins
when they switch their replication timing from early to 
late S phase. Despite a general positive correlation of early 
replication with the high density and frequency of the 
usage of origins, the EtoL replication domains had even 
slightly lower origin scores when they were early replicat- 
ing than when late replicating [80]. The exact localization 
of origins on the sequences of EtoL domains could clarify 
whether the additional origins acquired upon the EtoL
transition during differentiation of pluripotent hESCs are
L1-associated.

Together, these facts favor the hypothesis that L1s
within developmentally regulated EtoL domains can be
points of a strong attachment to the INM and peripheral
nuclear matrix, thus keeping these domains in the default
silent state. Importantly, such a role may be linked to the
binding of the ORC by G4 within the L1 3' UTR and,
therefore, to canonical origin-based replication. L1 RNP,
the molecular machinery of the proposed noncanonical
L1 DNA replication, could be a more successful competitor
for L1 sequences than the nuclear matrix and the
ORC, which would preclude the L1 silencing scenario.
Undoubtedly, L1-MAR and L1-ORC relationships need to
be investigated in differentiated and non-differentiated cell
systems and viewed in the context of developmentally
regulated replication domains.

Three important points are relevant to this discussion.
First, experimental tethering of a number of loci to the 
INM causes their downregulation and the repression of 
neighboring genes and genes that are located far from the 
loci. However, experimental untethering by using a com- 
petitor compound that binds the target site induces the re- 
positioning of the locus and adjoining segments away 
from the nuclear periphery and re-establishes transcrip- 
tional competence [106]. Second, transcriptional compe- 
tence is established at the time of replication [107]. Early S 
replicating sequences are assembled into nucleosomes 
enriched with acetylated histones H3 and H4, the marks 
of open chromatin, as opposed to late S replicating DNA, 
which is packaged mainly into silent chromatin marked by 
deacetylated forms of these same histones [107,108]. 
Third, the nuclear periphery, which is essentially a repres- 
sive environment, has early S replicating and transcrip- 
tionally active subcompartments [59,109-111] that appear 
to be more prominent in early embryonic and transformed 
cells than in differentiated cells. 

Taking all of this into account, it can be speculated that
L1s, being untethered from the INM and repositioned into
early S replication compartments, could then assemble with
acetylated H3/H4. Indeed, activation of L1s in HeLa cells
by a carcinogen, benzo(a)pyrene, increases the H3K9ac
mark at the L1 5' UTR [104]. As proposed above, L1 RNP
bound to complementary L1 DNA and/or noncanonical L1
DNA replication might favor the untethering of the
implicated chromatin domains from the INM. These liberated
segments can relocate to the nuclear interior, the location
of dominating early S replication [112] and transcriptional
competence. Alternatively, these liberated domains can
become early S replicating and transcriptionally competent
without noticeable repositioning towards the nuclear
interior. This idea is consistent with the observation that the
nuclear periphery can be almost entirely (mouse zygote) or
partially (mESCs, many types of cancer cells) represented
by euchromatin [58,59,113], which appears to replicate in
early S phase, at least in the zygote [59]. In ESCs, the small
size of alternating early- and late-replicating domains,
together with the anchorage of late S-replicating segments to
the INM [45], suggests that many small L1-rich early
S-replicating pluripotency "indicator" domains are restrained
in the nuclear periphery. The localization of L1-encoded
proteins within the nucleus can be a cue to where L1 RNPs
may act with regard to the nuclear periphery. In A-375
melanoma cells, L1 ORF2p-specific fluorescent signals
appear as a dense rim in the nuclear periphery and patches of
sparse speckles that protrude into the nuclear interior [30].
However, in the colon cancer cell line H1299, ORF1p-specific
signals form multiple foci across the entire space of
the nucleus [114]. This suggests that L1 RNPs may act in
the nuclear periphery and in the nuclear interior in a cell
type-specific manner.

A replication-timing program, which governs the
transcriptional competence of chromosome domains, is
established during early G1 phase, a short window of
opportunity termed the timing decision point (TDP) [115].
Post-mitotic re-establishment of 3D chromatin architecture
occurs at the TDP, and developmental cues that
change a replication-timing program are likely to act during
this short time window [115]. If the proposed
noncanonical replication of L1s does occur and functions as
a regulator of replication timing and spatial positioning of
the involved domains, L1 RNA-L1 DNA interactions for
DNA replication should be established no later than the
TDP. This means that the sites of noncanonical DNA
replication are likely to be licensed from early G1 onward, and
their licensing could serve as an epigenetic determinant.
Alternatively, this epigenetic role could be performed by
noncanonical replication of L1s if it starts at the TDP. The
latter could be the case during the first round of DNA
replication in the embryo. Noticeable DNA synthesis by
reverse transcription, which precedes DNA polymerase-dependent
DNA replication in mouse pronuclei [27], could
be the first phase of DNA replication and serve as the
epigenetic mechanism implicated in the establishment of
the initial replication timing program and chromatin
architecture. From this viewpoint, it is not surprising that the
DNA synthesis by reverse transcription is more prominent
in the male than the female pronucleus [27] because the
hypercondensed paternal chromatin requires more extensive
reorganization than the maternal chromatin. The
organization of sperm chromatin favors the early onset
of L1-related reverse transcription in the male pronucleus.
Specifically, a small portion of the genome is
undermethylated and packaged with histones into active
nuclease-hypersensitive chromatin; these segments of
the genome are highly enriched with L1s [116,117].
These L1 sequences are found at the periphery of the
sperm nucleus [116], the same location where pronuclear
reverse transcription occurs [27].

Biological significance of L1 RNP: a step beyond 
retrotransposition 

Two L1-encoded proteins, ORF1p and ORF2p, are translated in unequal
amounts from a bicistronic FL-L1 transcript [1,118] and bind to the
RNA from which they are translated [1,13,72]. This suggests that L1
RNP functions as a molecular machine in vivo. ORF1p forms trimers
that polymerize under the very conditions that support
high-affinity nucleic acid binding [119]. Polymerized trimers of
ORF1p bind to L1 RNA, and one or two molecules of ORF2p attach at
or near the L1 RNA poly(A) tail [1,13,72,119]. ORF1p possesses a
nucleic acid chaperone activity on oligonucleotide substrates
in vitro; specifically, it promotes accelerated and stringent
annealing of complementary nucleic acid sequences by facilitating
the melting of imperfect duplexes, strand exchange, and the
stabilization of perfect duplexes [120,121]. However, the
biological significance of the ORF1p chaperone function is poorly
understood.

First, it is unclear which type(s) of duplexes ORF1p helps to form
in vivo. On one hand, the chaperone function of ORF1p has been
demonstrated on DNA oligonucleotides in in vitro assays [120]; on
the other hand, ORF1p preferentially binds to L1 RNA in vivo and
in vitro [69,122]. Considering the complementarity of the poly(A)
tail of L1 RNA to the poly(T) segment of a typical 5' T_n/A_n 3'
cleavage site of L1 EN [123], formation of a short DNA:RNA duplex
is proposed to occur to prime reverse transcription during
retrotransposition [120]. ORF1p is also speculated to promote the
exchange of a DNA:DNA duplex to an RNA:DNA hybrid at the target
site [120]. However, the enormous mass of ORF1p trimers that bind
to L1 RNA [121] seems excessive to merely promote the formation of
a short RNA:DNA duplex to prime cDNA synthesis in vivo. Moreover,
because the liberation of 3'-OH at the nick site is sufficient to
prime reverse transcription on an L1 RNA template in vitro [124],
it remains uncertain whether such short RNA:DNA duplexes are indeed
formed to initiate reverse transcription in vivo.

Second, it is unclear what processes require ORF1p as a chaperone
in vivo. Its implication in retrotransposition might not be the
only role it plays. Endogenous L1 RNAs, which form L1 RNPs in
hESCs, belong not only to retrotranspositionally active (L1Hs) but
also to retrotranspositionally inactive L1 subfamilies (L1PA2,
L1PA3, L1PA4, L1PA6, and L1PA7) [56]. It is unlikely that hESCs
synthesize retrotransposition-inactive L1 RNPs having no function.
Therefore, ORF1p as a part of retrotranspositionally inactive L1
RNP might play a yet unknown role.

ORF1p is deemed essential for the retrotransposition of L1s
expressed from L1 constructs in transfected cells [121,125,126].
This is evidenced by the fact that mutant ORF1 proteins with
impaired chaperone function but unaffected RNA-binding activity
abolished or reduced retrotransposition in comparison with the
wild-type (wt) ORF1p in cell-based assays [125]. Although ORF1p is
non-essential for retrotransposition in a cell-free in vitro assay
[124], its availability increases the quantity and length of
nascent cDNAs and promotes the initiation of cDNA synthesis at more
typical retrotransposition start sites [72]. The role of a
non-mutant ORF1p in the retrotransposition of a "synthetic" L1
element in cell-based assays might be the same as in a cell-free
system. Specifically, it could promote the synthesis of a longer
cDNA strand, including a reporter cassette upstream of the L1
3' UTR, so that a retrotransposition event is detectable.

While the integration of a "synthetic" L1 element into the genome
is random [127], the integration of endogenous L1s seems to be
non-random and biased to a similar sequence environment. Although
post-insertional selection and recombination influence the genomic
distribution of L1s, the non-random integration of endogenous L1s
appears to be an important factor in the biased localization of L1s
in GC-poor/AT-rich regions of the genome [128-132]. Analyses of the
distribution of L1s in mammalian genomes have led to the conclusion
that L1s tend to cluster [130,133]. However, there is no current
consensus on whether clustering is a general feature of L1s [130]
or more pronounced among old L1 elements [133]. The 100 kb flanking
sequences of human L1s of a currently active subfamily Ta-1 (also
known as L1Hs-Ta1) and older L1s (L1PA2 and L1PA5) are enriched in
L1 DNA [130]. Interestingly, the sex chromosomes, which are
enriched in ancestral L1s, are much less hospitable for Ta-1
insertions than chromosome 4, which is enriched in Ta-1 elements
[130]. Although L1s are estimated to insert in pre-existing L1s
only 13% of the time [134], the portion of L1-derived sequences
that harbor new L1 insertions can be larger. Remains of the
3' poly(A) tails of previous L1 insertions that bear L1 EN
recognition motifs are thought to be common target sites for L1
retrotransposition [123].

Despite the incompleteness of our knowledge regarding the
incidence, degree, and length of sequence similarity between L1
insertions and surrounding regions, available data fit the concept
of sectorial mutagenesis introduced by Jurka and Kapitonov [128].
This concept implies that new insertions of transposable elements
tend to occur in specific chromosomal regions. Importantly, the
density of LINEs correlates more strongly with specific orthologous
segments of the human and mouse genomes than with the local GC
content [3].

The factors that determine the non-random integration of endogenous
L1s and random insertions of "synthetic" L1 elements remain
unexplored. It has been hypothesized that the higher frequency of
target sites and the open state of chromatin could contribute to
the insertional bias of endogenous L1s [132]. The fact that
"synthetic" and endogenous L1s target the same consensus sequence
[127,134], but demonstrate different patterns of
retrotransposition, does not favor the notion that the frequency of
the target sites could be a key factor of non-random
retrotransposition of endogenous L1s. The open state of chromatin
established on certain chromosomal domains might be a favorable
condition rather than a determinative factor for non-random
retrotransposition of L1s.

Not excluding other factors that can contribute to the insertional
bias of L1s, I hypothesize that retrotransposition of endogenous
L1s might be linked, at least to some extent, to noncanonical DNA
replication. If this process fails, the result may be non-random
retrotransposition of endogenous L1s. More random
retrotransposition of "synthetic" L1s might be caused by the
inability of a reporter cassette-bearing L1 RNA to pair with a
complementary sequence in the genome to perform noncanonical DNA
replication. Moreover, the chaperone activity of ORF1p might be
essential for the recognition of complementary genome sequences by
L1 RNAs and their pairing. ORF1p that promotes the melting of
imperfect duplexes may contribute to random retrotransposition of
"synthetic" and non-random retrotransposition of endogenous L1s. In
addition to the known biased insertions of L1s, other findings
discussed below favor this hypothesis.

An unequal potency of retrotransposition among endogenous FL-L1s
capable of producing functional proteins is thought to be, at least
in part, due to differences in some measures of the chaperone
activities of ORF1p variants [121]. Importantly, a reference point
on the scale of retrotransposition potency, often termed wt, can,
paradoxically, be a measure of the failure of distinct ORF1
proteins to perform other biologically essential functions. An
example of such an L1 element in mice could be a
retrotransposition-efficient variant of L1spa that encodes ORF1p
with an aspartic acid codon at residue 159 (D159) [121]. In
contrast to the D159 variant, another variant of L1spa that encodes
ORF1p with a histidine codon (H159) at this position is known as a
retrotransposition-inefficient element [121]. Interestingly, the
less active variant, H159 ORF1p, is much more successful at melting
a mispaired DNA duplex than the more active D159 ORF1p, which is
not able to fully melt an imperfect duplex in the absence of strand
exchange [121,126]. If L1 RNP does perform an important function on
genomic DNA that requires perfect pairing of L1 RNA and
complementary DNA, the efficient melting of mismatched duplexes by
ORF1p could be essential for displacing L1 RNA from a mispaired
DNA:L1 RNA hybrid and, therefore, for promoting the formation of
completely paired L1 DNA:RNA hybrids. Consequently, L1 RNA that
encodes ORF1p capable of efficient melting of mismatched duplexes
might be less prone to retrotransposition in vivo.

Sequence composition of L1s favors the formation of L1 DNA:RNA
hybrids reminiscent of long R loops. An R loop is an unwound DNA
segment, one strand of which associates with the complementary RNA,
whereas the second DNA strand appears as a displaced loop [135].
A/T richness and paired stretches of polypurines:polypyrimidines,
the characteristic features of L1s, are required for dsDNA to be
prone to the formation of an R loop [135]. The formation of R loops
spanning several kb is possible; however, auxiliary factors are
required for the unwinding and stabilization of long ssDNA segments
[135].

If the noncanonical mechanism of DNA replication does exist and new
integration events of L1s are indeed linked to their noncanonical
replication in early embryos and certain types of cancer, then L1
retrotransposition can be expected to occur in these cell systems
rather than in all types of cells where L1s are actively
transcribed. It has become evident that retrotransposition of
genomic L1 elements occurs mainly in early embryonic cells but not
in germline cells, as previously thought [19,136], and in certain
types of cancer cells but not in normal tissue counterparts
[20,21]. Despite the fact that L1 RNA is available in female germ
cells and tremendously abundant in spermatogenic cell fractions,
retrotransposition events are rare in the germlines; this is in
contrast to much more frequent integration events in
preimplantation embryos [19]. Interestingly, L1 RNA that is
retrotransposition inactive in the germlines is carried over into
the embryo where it remains stable and then becomes
retrotranspositionally active in the cleaving embryo [19]. L1 RNA
transcribed in the embryo causes even more retrotransposition
events than does carried-over L1 RNA [19]. Direct evidence of
endogenous L1 retrotransposition associated with L1 activation in
cancer cells has recently been reported [20,21]. In transgenic mice
carrying the human L1 element, retrotransposition events have been
found to occur in chemically induced skin tumors but not in the
adjacent normal skin tissue [20]. As shown by two high-throughput
L1-targeted resequencing methods, retrotransposition of L1Hs occurs
in certain human colorectal tumors but not in the surrounding
normal colon tissues [21]. Importantly, the number of new L1
insertions in human colorectal tumors was not correlated with the
degree of hypomethylation of L1 promoters [21]. These findings
suggest that the activation of L1 expression as a result of L1
demethylation is a necessary but not sufficient condition to cause
a high retrotransposition rate.

Further investigation of some identified hotspots of L1 insertions
is required to determine conditions and molecular processes that
might favor L1 retrotransposition on the genome scale. Such
hotspots have been found in the vicinity of certain genes expressed
in gonads and during embryogenesis [132]. If retrotransposition of
L1s is linked to their noncanonical replication, L1 insertions are
anticipated to be biased to certain sets of L1-rich EtoL domains,
the early replication and transcriptional competence of which is
characteristic of either embryonic or cancer cells. In this
context, it would be interesting to study two potential links
regarding the L1 integration hotspots: (i) the link between the L1
sequences integrated within the hotspots and FL-L1 RNA species
carried over into the zygote and expressed during development, and
(ii) the link between these hotspots and L1-rich EtoL
developmentally regulated domains.

The functional features of L1 ORF2p could potentially make it
capable of providing the putative noncanonical replication of FL-L1
loci. In in vitro assays, L1 RT has demonstrated a high
processivity on both RNA and ssDNA templates and the ability to
switch templates from RNA to cDNA in order to synthesize the second
strand cDNA [124,137]. This is consistent with the capability of
L1s to generate full-length insertions in vivo. L1 EN generates
single strand nicks in dsDNA with a preference for TA dinucleotides
within 5' TTTT/AA 3' target tracts; additionally, L1 EN is able to
efficiently nick other sets of dinucleotides within a loose
consensus sequence [123,138]. From the perspective of the proposed
model, this nicking flexibility might be essential to generate two
nicks in order to prime first and second strand cDNA synthesis. The
first nick might occur in the bottom strand complementary to an L1
A-rich tail, which is known to consist of the AATAAA poly(A) signal
followed by A_n interrupted by short GT- or T-rich motifs [139]. A
putative location of the second nick could be in the top strand at
the beginning of the L1 5' UTR. An interesting nuance is that L1 EN
activity increases dramatically on an unwound DNA helix [123].
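
The strand logic of the preferred L1 EN target tract can be made
concrete with a short script. The following is a minimal sketch
under stated assumptions, not a published tool: the function names
and the example sequences are hypothetical, only the strict
5' TTTT/AA 3' tract is searched (the loose consensus mentioned
above is ignored), and a match marks a candidate site only, not an
observed nick.

```python
# Illustrative sketch only; function names and example sequences are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def _motif_starts(seq: str, motif: str):
    """Yield every 0-based start index of motif in seq, overlaps included."""
    start = seq.find(motif)
    while start != -1:
        yield start
        start = seq.find(motif, start + 1)


def find_nick_sites(seq: str, motif: str = "TTTTAA", cut_offset: int = 4):
    """Return sorted (boundary, strand) pairs for candidate nick sites.

    boundary k denotes the bond between top-strand bases k-1 and k.
    cut_offset=4 places the nick after TTTT, i.e. at TTTT/AA.
    """
    seq = seq.upper()
    # Top-strand hits: the nick lies cut_offset bases into the motif.
    sites = [(i + cut_offset, "+") for i in _motif_starts(seq, motif)]
    rc = reverse_complement(seq)
    # A hit at index j on the reverse complement is a bottom-strand site;
    # convert its boundary back to top-strand coordinates.
    sites += [(len(seq) - (j + cut_offset), "-") for j in _motif_starts(rc, motif)]
    return sorted(sites)
```

Calling find_nick_sites on a sequence reports, for each strand,
where a nick at the strict tract would fall; a "-" entry
corresponds to a bottom-strand site of the kind proposed above for
the first nick.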

Together, these findings favor the hypothesis that L1 RNP functions
as the molecular machinery of noncanonical replication of L1 units
in concert with other cellular factors that are likely to be
available when this mechanism is active. Both L1-encoded proteins
appear to be indispensable for the proposed mechanism, and this
implies that only those FL-L1 transcripts that are assembled with
both proteins can function in terms of noncanonical replication.
The strong preferential binding of ORF1p and ORF2p with their
encoding L1 RNA and the chaperone activity of ORF1p can provide a
high level of specificity in recognition of "parental" L1 DNA units
subjected to noncanonical replication and, therefore, in epigenetic
targeting on a genomic scale.

L1 RNA and proteins: what, where, and when? 

To better understand the epigenetic role(s) of the activated
FL-L1s, it is important to determine the patterns of L1
transcription and synthesis of L1 proteins in different types of
cells and potential links between these patterns and cell
phenotypes. It has long been accepted that the production of FL-L1
RNAs occurs notably in the germline, early embryos, and many types
of cancer cells, whereas it is mainly shut down in the majority of
normal unstressed somatic tissues. However, a highly complex
picture of L1-involving pathways in a tissue-specific context has
started to emerge. First, recent research shows that cells from a
broad range of normal organs actively synthesize FL-L1 RNAs [140];
however, the majority of these transcripts undergo splicing and/or
premature polyadenylation [140-142]. Unfortunately, the scant
amount of data on L1 RNA sequences and proteins in some organs,
e.g., in the placenta and esophagus [140,143,144], does not allow
for a definite conclusion on these patterns. Second, there is some
uncertainty regarding the interpretation of data obtained by
methodological approaches appropriate for a less complex system.
(For a discussion of these issues, please see Additional file 1).
Consequently, the tissue specificity of the synthesis of FL-L1 RNAs
and full-size L1 RNPs and correlations with distinct phenotypic
features can be discussed only with a limited degree of confidence.
Third, there appear to be different patterns of expression of FL-L1
RNAs that might not necessarily result in the production of
full-size L1 RNPs. Therefore, the relationship between the
synthesis of FL-L1s and phenotypic properties might not be
straightforward. Finally, the assembling/functioning of full-size
L1 RNPs seems to be suppressed in gametes, but becomes activated
after fertilization.

Currently, there is no convincing evidence that noticeable amounts
of full-size L1 RNPs are synthesized in either male (adult and
prepubertal) or female germline cells. Available data suggest
either that FL-L1 RNA and ORF1p are synthesized while ORF2p is
missing (as observed in early meiotic spermatocytes) or present at
a very low level (female gametes), or that L1 proteins are produced
from shortened L1 transcripts (as observed in secondary
spermatocytes and spermatids) (Figure 3). Therefore, execution of
an RT-mediated program might be blocked in germline cells.
Specifically, in adult human testes, L1-related poly(A) RNAs are
extremely abundant; however, no FL-L1 RNA has been found by
Northern blotting because the majority of FL-L1 RNAs undergo
premature polyadenylation combined with splicing [140-142]. These
processed L1 RNAs can potentially translate into either ORF1p or
ORF2p or their truncated forms. Indeed, both ORF1p and ORF2p (or
their truncated forms, as discussed in Additional file 1) have been
detected by immunostaining in somatic testicular cells, secondary
spermatocytes, and immature spermatids in adult human testes [143].
Similarly, no FL-L1 RNA has been detected in adult mouse testes by
Northern blotting, whereas short L1 transcripts of variable lengths
were abundant in both germ and somatic cells [18]. In adult mouse
testes, ORF1p-related immunostaining has been detected in somatic
cells and spermatids, but no ORF2p-specific immunostaining has been
revealed [18].

Figure 3 Patterns of expression of L1 RNA, ORF1p, and ORF2p during
spermatogenesis. Available data suggest that the synthesis of
full-size L1 RNP consisting of FL-L1 RNA, ORF1p, and ORF2p is
repressed during spermatogenesis. No L1-related products have been
found in spermatogonia. Transient expression of FL-L1 RNA and ORF1p
(but not ORF2p) occurs in spermatocyte I at the onset of meiosis
(leptotene through mid-pachytene stage of prophase I). These
upregulated FL-L1s are implicated in chromosome pairing [145].
Transcription of L1s resumes in spermatocyte II and continues in
spermatids. However, L1 RNAs are mostly short, spliced, and
prematurely polyadenylated species that translate into either ORF1p
or ORF2p. Their functional significance is not known. Expression of
L1s, ORF1p, and ORF2p is downregulated by the spermatozoa stage. A
small amount of FL-L1 RNA (not detectable by Northern blotting) is
available in spermatozoa. ORF2p, not found in testicular
spermatozoa [18], is detectable in the sub-acrosomal space in
mature sperm [27].

Although processed L1 transcripts prevail in adult testes, FL-L1
RNAs, which are undetectable by Northern blotting, might be present
in early meiotic (leptotene and zygotene) spermatocytes. This cell
fraction is rare in adult mouse testes but is much better
represented in prepubertal testes where it accounts for the
abundant ~7 kb sense-strand L1 transcripts [18]. The transient
expression of L1s and ORF1p coupled with L1 DNA demethylation is
intrinsic to the onset of normal meiosis (leptotene through
mid-pachytene stage) in every round of spermatogenesis [145,146].
This type of L1 expression, downregulated in late meiotic prophase
I, is unrelated to the production of processed L1 transcripts
triggered later in spermatogenesis [18,145]. The transient
expression of FL-L1s and ORF1p is proposed to be a programmed,
though not understood, event associated with chromosome pairing and
assembly of the synaptonemal complexes in male meiosis [145].
Because L1 retrotransposition is highly repressed in the germline
compared with early embryogenesis [19], and ORF2p appears to be
unavailable in early spermatocytes, it can be speculated that the
L1 RNPs, implicated in early male meiosis, are not full-size L1
RNPs. Similar to the onset of male meiosis, ORF1p is transiently
expressed in female germ cells entering meiotic prophase I in the
mouse embryonic ovary [144], suggesting the same role of L1
expression in chromosome pairing.

Another category of germ cells likely to accumulate small amounts
of FL-L1 RNA are male and female gametes. This is supported by the
fact that FL-L1 RNA carried over into the zygote by both gametes
causes detectable retrotransposition events in the embryo [19].
Because the carried-over FL-L1 RNA remains stable and capable of
retrotransposition during early embryogenesis [19], it might be
implicated in the L1-linked RT-dependent synthesis of DNA not only
in the zygote but also in the cleaving embryo. While small amounts
of FL-L1 RNA seem to be present in both gametes, it remains
unexplored whether this RNA is assembled into RNP with one or both
L1-encoded proteins. The synthesis of ORF2p is downregulated in
testicular sperm cells [18] but appears to resume in the epididymal
spermatozoa because ORF2p is found in the sub-acrosomal space of
these cells [27]. Therefore, the synthesis of ORF2p seems to
restart at the terminal stage of spermiogenesis when the synthesis
of ORF1p is downregulated. As shown by immunostaining, ORF1p and
ORF2p are barely detectable at the terminal stages of mouse
oogenesis [27,144]. In a full-size L1 RNP, ORF1p is typically
present in great excess compared to ORF2p [1]; therefore, weak
ORF1p-specific immunostaining can reflect the downregulated
synthesis of ORF1p in oocytes. Because of the paucity of ORF2p
present in L1 RNP [1], weak ORF2p-specific immunostaining dispersed
within the cytoplasm of the oocyte [27] may not suggest the lack of
ORF2p if compared with the amount of ORF2p in the epididymal
spermatozoon. Together, these findings favor the assumption that
small amounts of FL-L1 transcripts can be stored in both male and
female gametes, but the formation of FL-L1 RNA/ORF1p/ORF2p
complexes might be blocked due to the downregulated synthesis of
ORF1p. ORF2p, which is synthesized at the very terminal stage of
sperm maturation and also seems to be present in the oocyte, could
be destined to initiate the synthesis of L1 DNA by means of reverse
transcription in both zygotic pronuclei.

Preimplantation embryos likely synthesize full-size L1 RNPs;
however, systematic studies of L1 RNAs/ORF1p/ORF2p are required for
definite conclusions. Strongly upregulated expression of L1s [147],
noticeable RT-dependent DNA synthesis, and the significant increase
of L1 copy number in two-cell mouse embryos [27] suggest that L1
RNPs are likely present and function during this stage. The
abundance of sense-strand FL-L1 transcripts in mouse blastocysts
[14] and the presence of FL-L1 RNAs and ORF1p assembled into RNPs
in hESCs and iPSCs [56,148] favor the idea that full-size L1 RNPs
can be present at least in pluripotent cells of the blastocyst.

The exact developmental window when such RNPs are formed remains to
be determined. Although genome-wide intense upregulation of L1s
occurs and plays an important role in preimplantation embryos, the
less apparent production of sense-strand FL-L1 RNAs and proteins
can still be present or transiently reinstated in distinct lineages
or cell types later in development. The possibility of L1
expression and retrotransposition in human neural progenitor cells
is suggested by the increased copy number of endogenous L1s in
adult brains when compared with heart and liver samples obtained
from the same individuals [149]. Moreover, mouse myogenic
precursors, the differentiation of which is promoted by nevirapine
[38], could also be a cell type that synthesizes some amount of
full-size L1 RNPs.

Several types of cancer cells also seem to synthesize full-size L1
RNPs. With regard to L1-related products, the most studied cancer
cells are cell lines derived from germ-cell tumors, mostly
testicular, that are embryonal carcinomas and teratocarcinomas
(teratomas with an embryonal carcinoma component) [64]. Embryonal
carcinoma cells are highly malignant counterparts of the ICM: they
express pluripotency markers and can be maintained as
undifferentiated cells or induced to differentiate by morphogens
[64,150]. Mouse F9 and C44 embryonal carcinoma and human NTera2D1
teratocarcinoma cell lines are known to actively synthesize
sense-strand FL-L1 RNAs [15,16]. These transcripts form RNPs with
ORF1p in F9 and C44 cells [16,68]. The presence of RT activity
associated with L1 RNPs in NTera2D1 cells [151] suggests that
full-size L1 RNPs may be synthesized in these cells. The fact that
the malignant pluripotent cells originate from germ cells but not
other cell types could be explained by the proposition that L1
RNAs, synthesized during gametogenesis and carried over into the
zygote, have a pluripotency-linked function in the early embryo.

FL-L1s and ORF1p are also upregulated in a range of tumors and
transformed cell lines, and this upregulation correlates with a
transition to undifferentiated phenotypes, higher tumor grade, and
poorer prognosis [17,114,140,152-154]. Despite the lack of parallel
analyses of ORF2p in many studied cancers, the results discussed in
the second section of this review suggest that ORF2p is also
present in numerous poorly differentiated tumors. Consequently, the
synthesis of L1 RNPs is likely a characteristic feature of many
cancers.

In addition to many types of cancers, the activation of L1s might
be intrinsically linked to cell dedifferentiation in certain
regenerating cell systems. For example, an L1-like retrotransposon
that encodes ORF1 and ORF2 is dramatically upregulated in the
blastema during axolotl (Ambystoma mexicanum) limb regeneration
[155]. This activation of L1s slightly precedes the upregulation of
a limb regeneration marker [155]. Interestingly, the completion of
the regeneration of the amputated limb was accompanied by a 16%
increase in L1 DNA copy number [155]. Surprisingly, the second wave
of regeneration after re-amputation of the same limb resulted in a
70% increase in L1 DNA copy number [155]. Although the nature of
this enormous increase in L1 copy number is not known, the authors
interpret their data as retrotransposition. It is tempting to
speculate that the herein proposed noncanonical mechanism of L1 DNA
replication might be recapitulated in blastema to allow cell
dedifferentiation. Moreover, the increase in L1 DNA copy number
after the completion of the regenerative process could be due to
the accumulation of extrachromosomal L1 DNA copies. The synthesis
of episomal L1 DNA copies (discussed below) and their stockpiling
might be part of a cell "memory" mechanism aimed to accelerate
noncanonical L1 DNA replication and dedifferentiation in response
to a repetitive severe injury. It has been reported that repeated
amputation of the axolotl limb results in accelerated regeneration
[156], although the underlying mechanism is not understood.

Together, the analysis of L1 expression in a cell type-specific
context shows that a correlation between a noticeable production of
FL-L1 RNAs and cell phenotypic properties is not straightforward.
Importantly, the production of FL-L1s might not necessarily always
lead to the synthesis of both L1-encoded proteins and the formation
of full-size RNPs. This may occur at the onset of meiosis and
during the terminal stages of gametogenesis. The synthesis of
full-size L1 RNPs in mitotically dividing cells appears to be
strongly implicated in establishing gene expression profiles
characteristic for totipotent/pluripotent and poorly differentiated
cells.



A shift in the current L1 paradigm: has the time come? 

Barbara McClintock's theoretical postulates on transposable genetic
elements [22] were met with enduring reluctance, but this
reluctance eventually evolved into acknowledgement of her discovery
and revolutionary concept. Paradoxically, this now widely accepted
concept seems to have become a barrier that impedes conceptual
advances in L1 research.

The current L1 paradigm can be described as
retrotransposition-centered: (i) retrotransposition is the only
RT-dependent function of L1s considered so far; (ii) the drastic
upregulation of L1s in early embryos and cancers is often deemed a
non-specific response to general demethylation of the genome
because it cannot be intended for retrotransposition, and other
possible functions are usually not considered; (iii) the
upregulation of endogenous L1s is usually thought to be a
sufficient condition for retrotransposition despite the lack of a
correlation between the abundantly expressed L1s and
retrotransposition in the male germ line [19]; (iv) while
retrotranspositionally active L1s are under scrutiny,
retrotranspositionally inactive FL-L1s are neglected as elements
that might be reverse transcribed and play an essential role in a
cell; and (v) the attributed function of premature polyadenylation
and splicing of L1 transcripts known to occur in many tissues is to
defend against retrotransposition [140-142]; however, it is
unlikely that L1 RNA is synthesized and processed merely to be
non-functional.



The adherence to this retrotransposition-centered paradigm is
reflected in the scarcity of research exploring other potential L1
RT-driven mechanisms. It is also evident in the interpretation of
data demonstrating significant increases in L1 DNA copy number in
the mouse zygote and cleaving embryos [27] as well as in
regenerated axolotl limbs [155] as a result of numerous
retrotransposition events. Although the reported increase in L1 DNA
copy number may be partially caused by retrotransposition events,
it is unlikely that retrotransposition is the sole L1 RT-dependent
process in these cell systems. The activation of L1s in colorectal
tumors is accompanied by 0 to 17 new insertions per tumor sample
[21]. Even if the degree of L1 activation in the early mouse embryo
and regenerated axolotl limb is higher than in tumors, L1
retrotransposition rates in these cell systems are unlikely to be
many times higher than in cancers. Consequently, other possible L1
RT-driven mechanisms are worth exploring.
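
The scale of the mismatch behind this argument can be illustrated
with a back-of-envelope calculation. The sketch below is only an
illustration under loudly stated assumptions: BASELINE_COPIES is an
arbitrary placeholder, not a measured L1 copy number for any
genome, and the function names are hypothetical. The fractional
increases are those reported for regenerated axolotl limbs [155],
and 17 is the upper end of the per-tumor insertion range [21]. The
comparison is also rough because a population-averaged copy-number
measurement and a count of detected insertions per tumor sample are
not the same kind of quantity.

```python
# Back-of-envelope sketch; BASELINE_COPIES is a made-up placeholder,
# not a measured L1 copy number for any genome.
BASELINE_COPIES = 100_000
TUMOR_MAX_INSERTIONS = 17  # upper end of the 0 to 17 range cited in the text [21]


def implied_new_copies(baseline_copies: int, fractional_increase: float) -> float:
    """Extra copies per genome implied by a fractional copy-number increase."""
    return baseline_copies * fractional_increase


def fold_over_tumor_max(new_copies: float, tumor_max: int = TUMOR_MAX_INSERTIONS) -> float:
    """How many times the implied gain exceeds the largest per-tumor insertion count."""
    return new_copies / tumor_max


if __name__ == "__main__":
    for increase in (0.16, 0.70):  # increases reported for axolotl limbs [155]
        extra = implied_new_copies(BASELINE_COPIES, increase)
        print(f"{increase:.0%} increase: {extra:,.0f} extra copies, "
              f"{fold_over_tumor_max(extra):,.0f}x the tumor maximum")
```

Whatever baseline is substituted, any genome carrying many
thousands of L1 copies would need far more new copies than the
tumor insertion counts to show a double-digit percentage increase,
which is the point the paragraph above makes qualitatively.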

One such mechanism could be the synthesis of extrachromosomal L1
DNA or L1 DNA-containing sequences. Abundant extrachromosomal
circular L1 DNA-containing products have been found in yeast [157]
and certain types of cancer cell lines [158]; however, the
biological significance of these products remains unknown. The
extrachromosomal L1 DNA copies might be the cause of significantly
increased L1 DNA copy numbers in the regenerated axolotl limbs. The
extrachromosomal L1 DNA copies may also be temporarily synthesized
during early embryogenesis, thereby causing the amplification of L1
DNA copy number.

The second potential L1 RT-driven mechanism is the
noncanonical L1 DNA replication proposed in this review.
In early embryos, this mechanism could account for the
qPCR-detectable amplification of L1 DNA copy number
in time windows when only L1 DNA is replicated (by the
noncanonical mechanism) in all or some embryo cells.
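
As an illustration of how a qPCR-detectable change in L1 DNA copy
number is commonly quantified, the following minimal sketch applies
the comparative Ct calculation. It is not taken from the cited
studies: the function name, the use of a single-copy reference
locus, the assumed amplification efficiency of 2 (perfect doubling
per cycle), and all Ct values are hypothetical.

```python
def relative_copy_number(ct_l1_sample, ct_ref_sample,
                         ct_l1_control, ct_ref_control,
                         efficiency=2.0):
    """Fold change in L1 DNA copy number (sample vs. control).

    Assumes the L1 and reference amplicons amplify with the same
    efficiency (hypothetical simplification).
    """
    # Normalize L1 signal to the reference locus within each condition.
    delta_sample = ct_l1_sample - ct_ref_sample
    delta_control = ct_l1_control - ct_ref_control
    # Compare the two conditions; a lower Ct means more template.
    delta_delta_ct = delta_sample - delta_control
    return efficiency ** (-delta_delta_ct)


# Hypothetical Ct values: L1 amplifies one cycle earlier in the sample,
# while the reference locus is unchanged.
fold = relative_copy_number(19.0, 22.0, 20.0, 22.0)
print(f"Relative L1 DNA copy number: {fold:.2f}-fold")
```

With these invented inputs, a one-cycle shift in the L1 amplicon
corresponds to a two-fold difference in copy number, which is the
scale of change discussed for early embryos.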

These two mechanisms may co-exist, interplay with
each other, and be important for the establishment of an
undifferentiated state of a cell. The noncanonical L1 DNA
replication mechanism could serve as an important
epigenetic mark that determines early replication of L1-rich
developmentally regulated EtoL domains, whereas the
formation of extrachromosomal L1 DNA copies could be an
auxiliary molecular tool in support of it.

The proposed model implies that the noncanonical L1
DNA replication mechanism is normally executed in the
totipotent and pluripotent cells of early embryos. Its
initiation and the primary specificity of the involved genomic
domains are thought to be determined by a subset of L1
RNAs carried over into the zygote. The upregulation of
the expression of FL-L1s at the two-cell stage and the
gradual changes of L1 expression profiles during
preimplantation development are deemed essential for the
establishment of stage-specific gene-expression profiles.
Noncanonical replication can potentially be triggered in
differentiated somatic cells, causing cell dedifferentiation
and transformation, but not pluripotency, because the
embryo- and cancer-specific profiles of FL-L1 RNAs are
established under the influence of different factors.

From the standpoint of the proposed model, the unsolved
L1-related issues mentioned in the second section
of this review can be explained. Specifically, the
co-expression of FL-L1 RNA and RT as well as DNA synthesis
by reverse transcription, coinciding with the two-fold
increase of the L1 DNA copy number in the early embryos,
can be biologically explained. The model also explains
the different responses of early-cleaving embryos
and transformed cells to L1 knockdown and RT inhibition,
specifically the complete cessation of divisions versus the
continued proliferation at a lower rate. In the zygote, and
to some extent the cleaving embryo, the specificity of a set
of domains affected by noncanonical L1 DNA replication
likely depends on L1 RNA delivered by the gametes. The
degradation of L1 RNA by the L1-specific RNAi at the onset
of embryogenesis does not allow the proper reprogramming
of the genome. The same situation applies to
the effect of RT inhibitors at this embryonic stage. A
consequence of L1 targeting may be the inability of a cell to
proceed with proper spatial genome repositioning, rather
than a failure to complete a round of DNA replication.
Those L1 RNPs that are bound to the genome for DNA
replication may be less likely targets than cytoplasmic
molecules. This notion is supported by the fact that the
targeting of either L1 RNA or RT in transformed cells does
not arrest the cells at a distinct point of the cell cycle. In
poorly differentiated cancer cells synthesizing L1 RNPs, the
experimental impediments to the putative noncanonical
replication of L1s might switch the involved domains into a
silent state. As a consequence, the gene expression profiles
of transformed cells may change to those reminiscent of
their normal counterparts. This may or may not cause a
steady transition to normal cell functioning, depending on
the "strength" of counteracting transforming factors and
on which point of the noncanonical replication mechanism has
been targeted. Both of these aspects could explain the
reinstated transformed phenotypes in a number of RT
inhibitor-treated cancer cell lines after the withdrawal of
the inhibitor. A clue as to why dedifferentiated transformed
cells reprogram to their normal counterparts but not to
other cell types upon the downregulation of L1s comes
from the finding of lineage-dependent EtoL domains that
are silenced during the specification of lineages [46]. These
domains can more easily be reprogrammed back than
pluripotency "indicator" EtoL domains. The changes of the
replication timing of a portion of lineage-dependent EtoL
domains might also be driven by the switch from
noncanonical to canonical replication of the resident L1s.



The majority of pluripotency "indicator" domains are likely
to remain silent in most cancers, except for embryonal
carcinomas and teratocarcinomas, whereas lineage-dependent
EtoL domains might be commonly implicated in malignant
dedifferentiation. Therefore, their silencing could favor
the reprogramming of transformed cells into the pathway
of their original lineage-specific differentiation. The
proposed model can also explain why the epigenetic
barrier established on the L1-rich EtoL pluripotency
"indicator" domains is very stable. If L1 transcripts
carried over by gametes into the zygote and synthesized in
the early embryo under their direct influence do establish
early replication of EtoL pluripotency "indicator"
domains, the lack of such transcripts can impose a very
stable silencing on these domains.

Some additional findings may or may not contradict
the proposed model. First, it is not clear whether the
results of cloning experiments fit the model. The model
implies that the carried-over FL-L1 transcripts delivered
by gametes and ORF2p are indispensable to set up the
initial 3D genome architecture and replication timing
program through noncanonical L1 DNA replication.
Because the metaphase II oocyte, the common recipient
used for somatic cell nuclear transfer [159], contains
nuclear factors in its cytoplasm, FL-L1 transcripts might be
available in the ooplasm if not bound to chromatin.
ORF2p is present in the epididymal spermatozoa [27].
However, it is not clear whether ORF2p is lacking in the
oocyte or a small amount of ORF2p is dispersed within
the cytoplasm and is therefore barely detectable. The
ability of the ooplasm to support reprogramming of
transplanted nuclei of somatic cells to the totipotent
state challenges the significance of L1 RT delivered by
spermatozoa for genome reprogramming at the onset of
embryogenesis.

Second, it is unclear whether the density of FL-L1s
within the pluripotency "indicator" and certain
lineage-dependent EtoL domains is high enough to control
tethering and untethering of these domains with regard to
the nuclear lamina. The average length of lineage-specific
L1s peaks at regions with a GC content of 39-40%
in the human and mouse genomes [3], suggesting
that FL-L1s might accumulate in the pluripotency
"indicator" domains, which have exactly the same GC
content [45,51]. In contrast to the mouse genome, which
has ~3000 potentially active FL-L1s [160], the human
genome harbors only ~85 retrotranspositionally active
copies of ~7000 FL-L1s [134]. Nevertheless, some
retrotranspositionally inactive FL-L1s might be capable
of reverse transcription in vivo. Because the number of
FL-L1s capable of reverse transcription remains unknown,
it is unclear whether the subset of reverse-transcribed
FL-L1s is large enough to establish transcriptional
competence for a large cohort of genes.

The hypothesis proposed in this review is testable. The
simplest experimental model to test whether noncanonical
L1 DNA replication occurs would be one-cell mouse
embryos. Two factors favor this experimental model: the
RT-dependent phase of DNA synthesis in zygotic pronuclei
precedes the DNA polymerase-dependent DNA replication,
and the time frames of these events have been defined [27].
Two sequential labelings of synthesizing DNA with
halogenated nucleotides (e.g., IdU and CldU) during these two
phases, and the subsequent visualization of their
incorporation by fluorescently labeled antibodies on stretched DNA
fibers combined with parallel L1 DNA-specific fluorescence
in situ hybridization (FISH), is expected to be informative.
The labeling protocol introduced for the single-molecule
analysis of replicated DNA [161,162] can be coupled with
a suitably modified version of the method of microfluidic
extraction and stretching of DNA from single nuclei [163];
the modification is required to provide better resolved
DNA fibers.
Whether the same approach is suitable for ESCs and certain
transformed cell lines is uncertain, because data are
lacking on whether the RT-dependent phase of DNA synthesis
exists in these cells, whether it overlaps with or precedes
the DNA polymerase-dependent phase, and whether the cells
would be able to resume DNA synthesis after the withdrawal
of aphidicolin. Additionally, ChIP-seq of either
BrdU-labeled nascent DNAs or nascent DNA-ORF2p
complexes obtained from aphidicolin-treated ESCs and
transformed cell lines could be considered. The knockdown
of specific subsets of FL-L1s, and the inhibition of
L1 RT in ESCs and transformed cell lines, followed by
analyses of replication timing, gene expression, and
S/MAR profiles at the genomic scale could clarify whether
activated FL-L1s regulate gene expression through the
establishment of replication timing and S/MAR profiles.

Prompted by anti-tumor effects of RT inhibitors in
experimental models, an attempt was made to employ
nevirapine for the treatment of non-HIV cancer patients in
a small clinical trial [164]. This clinical trial was also
based on positive outcomes of RT inhibitor-based treatment
regimes for HIV-related tumors, which could partially
be attributed to a direct anti-cancer activity of the
drugs [165]. However, this approach did not lead to the
anticipated result because nevirapine appeared to be toxic
to some non-HIV-infected cancer patients [164] and was
perhaps a suboptimal inhibitor of L1 RT. From the
standpoint of the model proposed here, targeting the L1
RNP-driven process at the RT level might be an ineffective
means to obtain the irreversible differentiation of cancer
cells even if highly specific anti-L1 RT drugs are used.
Preventing the licensing of sites of noncanonical
replication might be a more fruitful approach to obtain sustained
differentiation of cancer cells. Uncovering the biological
significance and the mechanism of L1 RT-dependent
DNA synthesis would inform the development of highly
targeted anti-cancer therapies and new approaches to
control the reprogramming of differentiated cells into iPSCs.
In addition, more detail on the sequences of the FL-L1
RNAs forming the full-size L1 RNPs in cancers would
open a new avenue in the field of cancer biomarkers.

Conclusions 

Available data demonstrate that several L1-related
phenomena cannot be explained within the framework of
the current retrotransposition-centered L1 paradigm. A
novel concept is required to explain the nature of
massive L1-linked reverse transcription at the onset of
embryogenesis and how abundantly expressed FL-L1
RNA and RT can globally control the epigenetic state of
a cell. A revised L1 paradigm should put into focus the
possibility of L1 RT-driven biologically significant
processes other than retrotransposition.

A new concept of noncanonical L1 DNA replication
that could exist in early embryos, ESCs, and certain
types of cancer has been introduced in this article. This
proposed model links undifferentiated states of a cell,
such as totipotency, pluripotency, and
regeneration-/cancer-related dedifferentiation, to this mechanism. The
hitherto unexplained phenomena that demonstrate crucial
though different outcomes of the downregulation of
L1s and RT in early embryos and cancers can also be
explained. First, the proposed model assigns a biological
function to upregulated FL-L1s, L1-encoded proteins,
and L1-linked reverse transcription. Second, it suggests
how the L1 RNP-driven process could potentially result
in transcriptional competence of specific domains of the
genome that harbor genes associated with undifferentiated
states. Moreover, the model demonstrates how the
L1 RNP-driven process could integrate with other
fundamental processes in the nucleus. Finally, the model
shows how the whole system might be regulated in
development and dysregulated in cancer.

An important aspect of this novel concept is that it
links retrotransposition of endogenously expressed L1s
to the putative noncanonical L1 DNA replication.
Evidence supporting this claim is provided. Endogenous L1
retrotransposition is clearly non-random and seems
biased toward a similar sequence environment. In addition,
L1 retrotransposition mainly occurs in proliferating
undifferentiated embryonic and cancer cells, but not in all
types of cells where L1s and FL-L1s are abundantly
expressed.

Although the current model of DNA replication seems
robust, it should be retested in specific genome locations
(distinct FL-L1 sequences) in early embryonic and cancer
cell systems. This is suggested by the failure of the
prevailing L1 paradigm to explain several important
L1-related phenomena and the plausibility of the proposed
model of noncanonical L1 DNA replication.

Reviewers' reports 

Reviewer 1: Dr. Philip Zegerman, Wellcome Trust/Cancer 
Research UK Gurdon Institute, University of Cambridge, 
Cambridge, UK (nominated by Dr. Orly Alter, University of 
Utah, Salt Lake City, USA) 

Understanding the physiological roles of transposable
elements is an important biological question. This review
aims to link the transcription and duplication of L1
elements to other cellular processes including replication
timing and changes in the chromatin state.

This review would have benefitted from a clearer and 
more precise analysis of key experiments in a defined 
manner. Instead sweeping conclusions are made from 
some sparse data e.g. "the data available at this time show 
no evidence that the massive nuclear reverse transcription 
occurring in early embryos is DNA replication independ- 
ent." p.24, yet the aphidicolin experiment in ref 27 clearly 
demonstrates the opposite. 

Response: Indeed, data used in this review are often
insufficient for definite conclusions. This is not surprising
because the issues discussed and questions raised in this
paper have never been addressed experimentally. However,
when sparse data accumulate to the necessary
threshold, I think it is timely to draw the attention of the
research community to interpretational or conceptual
issues. Some findings have been reported by authors as
minor details, but they have a certain value when viewed
in a new context or linked with other data and, therefore,
are worth being included in this review.

The requirement for well-supported conclusions to be
based on strong evidence is appropriate for a paper that
employs the deductive approach. This review, on the
contrary, is an inductive paper. I recognize the original text
contained some generalizations that could sound like
sweeping conclusions, and thus I have critically reassessed the
text and changed wording in some instances.

I do not agree with the latter comment regarding the
text on p. 24. The concluding sentence of the section that
is cited is taken out of context. It summarized the
three main points of the preceding discussion: 1) the
interpretation of aphidicolin-resistant abacavir-sensitive
synthesis of DNA by reverse transcription in the zygote as
DNA replication independent was based on the current
concept of DNA replication, which may not be
comprehensive; 2) there were some overlooked timing issues
related to initiation of DNA synthesis in the zygote,
which question the conclusion made in ref. 27; and 3)
there were some drawbacks in the design of the
experiments described in ref. 27, which made the experiments
inconclusive in terms of whether the DNA synthesis by
reverse transcription in the zygote was DNA replication
dependent or independent.

I would like to emphasize that the endurance of a par- 
ticular scientific hypothesis does not make it an ultimate 
truth. It is reasonable to interpret new results on the basis 
of a particular hypothesis until some data that support 
new testable predictions are obtained. I suggest that this is 
the case with the current concept of DNA replication. To 
this end, I have strengthened this point in the paper. I have 
also made small changes to clarify the point that experi- 
ments in ref. 27 were inconclusive with respect to their 
claim that the DNA synthesis by reverse transcription in 
the zygote was DNA replication independent. 

Another example would be the statement "these data
suggest that ORCs are highly likely to bind to G4
structures of L1s", p.28.

Response: I have clarified this statement by including
an additional point from the preceding discussion: "these
data suggest that ORCs are highly likely to bind to G4
structures of those L1s that tether to the nuclear matrix."
The logic underlying this statement is as follows. About
225,000 active origins (90% of all active origins) are
associated with G4s [80]; however, the number of inactive
G4-bound ORCs is not known. L1s are a substantial
source of G4-forming sequences. Given that the human
genome contains ~516,000 L1s, most of which are
truncated at the 5' (not at the 3') end [2], and all L1s with
intact 3' UTRs contain a G4-forming tract [95], the
number of G4-forming sequences can be significantly greater
than current estimates of 375,000 [Todd AK et al, Nucleic
Acids Res, 2005, 33:2901-2907]. Despite the growing
interest in the G4-ORC link, no one has attempted to estimate
what portion of G4s associated with ORCs is represented by
L1-derived G4s. With so many unknowns, a landmark for
future investigations could be what we can see in the
nucleus. The abundance of L1s among MARs and preferential
colocalization of origins and MARs suggest that chromatin
is organized in such a way that many L1s likely serve as
MARs and ORC binding sites at the same time.
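
The arithmetic behind these figures can be made explicit in a short
sketch. It uses only the numbers quoted in the response; the variable
names are illustrative, and the final ratio is a deliberately crude
upper bound that assumes every L1 copy retains an intact 3' UTR,
which the text does not claim.

```python
# Figures quoted in the response above.
g4_associated_origins = 225_000   # active origins associated with G4s
fraction_with_g4 = 0.90           # share of all active origins
l1_copies = 516_000               # L1s in the human genome
current_g4_estimate = 375_000     # estimated G4-forming sequences

# Total number of active origins implied by the 90% figure.
total_active_origins = g4_associated_origins / fraction_with_g4

# Crude upper bound (assumption): every L1 copy has an intact 3' UTR
# and therefore contributes one G4-forming tract.
l1_to_g4_estimate_ratio = l1_copies / current_g4_estimate

print(f"Implied active origins: {round(total_active_origins):,}")
print(f"L1 copies / current G4 estimate: {l1_to_g4_estimate_ratio:.2f}")
```

Under these assumptions the L1 count alone exceeds the quoted G4
estimate, which is the basis for the suggestion that the estimate
may be too low; the true contribution depends on how many L1s
actually retain the relevant tract.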

I would urge the author to reassess the review and 
re-balance the description and interpretation of the experi- 
mentation. This should be married with a considerable 
reduction (50%) in the length of the review to allow it to be 
accessible to as wide a community of scientists as possible. 

Response: I appreciate the concerns with respect to
making the manuscript accessible to as wide a
community as possible; however, I believe the paper will
lose its value to specialists as well as the integral view if
the experimental or interpretative components are so
drastically pruned.

This is a multidisciplinary work that integrates
experimental data from a number of fields. Therefore, some
introductory information regarding L1 biology, replication
timing, etc., is worth including so the paper is accessible
to a multidisciplinary readership. Moreover, retaining the
experimental data that might be considered as non-key
facts is important. From the perspective of the introduced
concepts, the whole picture that emerges from the
integration of the key and non-key facts is a more convincing piece
of information than several findings standing alone. This is
important because the concepts and interpretation of
certain experimental data are provocative.

This review is not in a narrative style. As mentioned
above, this is an inductive paper that has a considerable
interpretative component. The interpretative portion
of the manuscript is as important as the experimental with
respect to integrating the experimental material,
introducing alternative explanations, pointing out issues pertinent
to the current L1 paradigm, proposing conceptual changes,
and examining how the available data fit the model. The
discussion of potential links between phenomena that
have never been thought to be linked opens new avenues for
research. I believe there is some value in this intellectual
contribution.

In recognition of the length issue, I have deleted a few
details such as the names of genes that changed their
expression levels in response to the downregulation of L1s in A-375
melanoma cells and the concentrations of the RT inhibitors
used to reprogram cancer cell lines and to assess their effects
on retrotransposition of L1s. Some redrafting has also been
done to make some paragraphs more concise.

Quality of written English: Acceptable 

Reviewer 2: Dr. I. King Jordan, Georgia Institute of
Technology, Atlanta, USA

The manuscript on the functional significance of
(potentially) non-canonical L1 expression and replication by
Ekaterina Belan is a provocative mix of a review article
and a hypothesis paper. The author extensively reviews
current experimental evidence on the role of L1 reverse
transcription in early embryos and cancer in light of
recent findings on genome regulation, organization and
epigenetics. A key to understanding the author's approach is
the desire to explore novel functional roles for L1s that do
not fit within the current paradigm of L1 biology, which
focuses mainly on retrotransposition dynamics and host
genome mechanisms for the repression of transposition.

The search for a functional role of L1s rests on the
author's notion that since the main role of L1s is not the
introduction of genomic variation "it is logical to assume
that an important function (or functions) of L1s remains
to be discovered." While this kind of teleological
thinking is tempting, one does not need to invoke a direct
function of L1s to explain their existence and abundance
in the genome (or their regulatory anomalies for that
matter). As is held by the selfish DNA theory, the
existence of such elements can be explained solely by their
ability to out-replicate the genomes in which they reside.

Response: This is a very good point. I have revised the 
paragraph to include consideration of the evolutionary 
aspect. 

Having said that, once having established themselves
in their hosts' genomes, it is almost certainly the case
that elements of this kind can have a profound effect on
genome function. Accordingly, what the author refers to
as the current 'retrotransposition-centered paradigm' of
L1 biology may indeed lead to interpretations of
experimental evidence that are markedly different from those
offered in this manuscript. As such, the alternative
hypotheses and views proposed here do seem to cover new
ground, are thought provoking, lead to testable
predictions (to some extent), and are thus worthy of
publication in Biology Direct.

Some of the interpretations of L1 experimental data
presented here are likely to be controversial, particularly to
the extent that they differ from interpretations offered by
the authors of the studies that generated the data. Thus, the
paper has the potential to generate a substantive response
and a potentially interesting discussion in the field and/or
the literature. To her credit, the author does provide
specific experimental tests of her models as they relate to the
occurrence of non-canonical L1 DNA replication and the
role of full-length L1 expression in genome regulation.

Finally, it is worth noting that the topics covered in this
review, and in particular the experimental tests proposed,
could have biomedical relevance with respect to the link
between L1 reverse transcriptase-dependent DNA
synthesis and cancer and/or stem cells. A better understanding
of this phenomenon could hold promise for the
development of L1-related anti-cancer therapies and/or novel
methods for the reprogramming of differentiated cells to
pluripotent stem cells.

Quality of written English: Acceptable 

Reviewer 3: Dr. Panayiotis (Takis) Benos, University of 
Pittsburgh, Pittsburgh, USA 

This reviewer provided no comments for publication.

LINE: Long interspersed nuclear element; LINE-1 (L1): Long interspersed
nuclear element-1; FL-L1: Full-length L1; RT: Reverse transcriptase;
EN: Endonuclease; RNP: Ribonucleoprotein; cDNA: Complementary DNA;
L1Hs: Subfamily of human-specific (from Homo sapiens) L1 elements (also
known as L1PA1); L1PA2, L1PA3, L1PA4, L1PA6, L1PA7: Subfamilies of
primate-specific L1 elements; Ta-1: Transpositionally active subfamily of
human L1 elements (also known as L1Hs-Ta1); UTR: Untranslated region;
ORF: Open reading frame; ORF1: Open reading frame 1; ORF2: Open reading
frame 2; ORF1p: Open reading frame 1 protein; ORF2p: Open reading frame
2 protein; bp: Base pair(s); kb: Kilobase pairs; Mb: Megabase pairs;
siRNA: Small interfering RNA; RNAi: RNA interference; S/MAR: Scaffold/matrix
attachment region; SAR: Scaffold attachment region; MAR: Matrix attachment
region; ERV: Endogenous retrovirus; HERV-K: Human endogenous retrovirus
family; MuERV-L: Murine endogenous retrovirus-like element; AML: Acute
myeloid leukemia; ICM: Inner cell mass; ESC: Embryonic stem cell;
hESC: Human embryonic stem cell; mESC: Mouse embryonic stem cell;
iPSC: Induced pluripotent stem cell; EpiSC: Stem cell derived from the
epiblast; NPC: Neural precursor cell; EtoL: Replication timing change from
early to late S; LtoE: Replication timing change from late to early S; Xa: Active
X chromosome; Xi: Inactive X chromosome; BrdU: 5-Bromodeoxyuridine;
IdU: Iododeoxyuridine; CldU: Chlorodeoxyuridine; HCG: Human chorionic
gonadotropin; qPCR: Quantitative real-time polymerase chain reaction;
INM: Inner nuclear membrane; ORC: Origin recognition complex; ORCA:
ORC-associated protein; HP1: Heterochromatin protein 1; G4: G-quadruplex;
H3K9me3: Histone H3 trimethylated at lysine 9; H3K9ac: Histone H3
acetylated at lysine 9; TDP: Timing decision point; 3D: Three-dimensional;
wt: Wild type; R loop: RNA-DNA displacement loop; ssDNA: Single-stranded
DNA; dsDNA: Double-stranded DNA; FISH: Fluorescence in situ hybridization;
ChIP-seq: Chromatin immunoprecipitation followed by high-throughput
DNA sequencing; HIV: Human immunodeficiency virus.